Abstract

Our paper aims to model and forecast the electricity price by taking a completely new perspective 
on the data. It will be the first approach which is able to combine the insights of market 
structure models with extensive and modern econometric analysis. Instead of directly modeling 
the electricity price as it is usually done in time series or data mining approaches, we model and 
utilize its true source: the sale and purchase curves of the electricity exchange. We will refer 
to this new model as X-Model, as almost every deregulated electricity price is simply the result 
of the intersection of the electricity supply and demand curve at a certain auction. Therefore 
we show an approach to deal with a tremendous amount of auction data, using a subtle data 
processing technique as well as dimension reduction and lasso based estimation methods. We 
incorporate not only several known features, such as seasonal behavior or the impact of other 
processes like renewable energy, but also completely new elaborated stylized facts of the bidding 
structure. Our model is able to capture the non-linear behavior of the electricity price, which 
is especially useful for predicting huge price spikes. Using simulation methods we show how 
to derive prediction intervals for probabilistic forecasting. We describe and show the proposed 
methods for the day-ahead EPEX spot price of Germany and Austria. 

Keywords: electricity price forecasting, supply and demand curves, price spikes, auction data, 
bidding behavior, probabilistic forecasting 


1. Introduction 


In recent decades, modeling electricity prices has become a complex and broad field of
research. Due to the liberalization of markets and the increasing disclosure of data, new insights
concerning the structure and behavior of prices have been gained. Researchers have pointed out that
electricity prices share typical characteristics regardless of where they are traded. These
are summarized as the stylized facts of electricity prices, see e.g. Weron (2006). One of these
stylized facts concerns tremendous deviations of the price pattern from its mean, called price
spikes. This specific feature of electricity prices has a huge impact on research as well as on politics
and companies. Many electricity companies, e.g. in Germany, are obliged to market some of their
electricity at an exchange, which makes their earnings prone to heavy price spikes and creates a
complex task for their risk management department. Moreover, many financial contracts such
as futures or options depend on the variance of the price process and therefore demand
suitable estimation techniques. Long-term cost calculations for investment projects or political
programs like the development of renewable energy also depend on stable and reliable methods
for the calculation of electricity prices, which can account for the likelihood of price spikes.

Therefore, a great variety of models for estimating the electricity price has emerged during the
past decades. Those models are often related to well-known models from the finance literature but
can originate from many other fields of research. Weron (2014), for instance, divides electricity
price models into five groups: multi-agent, fundamental, reduced-form, statistical and
computational intelligence models. Apart from the multi-agent and fundamental approaches, all
models have in common that they focus on the price itself or on related time series such as renewable
energy or electricity demand. Multi-agent models usually focus on the supply and demand of
electricity to obtain prices by equilibrium, optimization or simulation (Ventosa et al. (2005), Liu
et al. (2012)), but often do not incorporate the time series of electricity bids and asks of a
real exchange into their approaches. Fundamental approaches cover a great variety of models but
mainly emphasize the basic economic and physical relationships of the market (Weron, 2014).

Concerning price spikes, the distinction between different model approaches can be refined
by considering whether price spikes are incorporated explicitly or implicitly. In the area of time
series models the usage of specific heteroscedastic models for the variance of the process is
typical (e.g. Bowden and Payne (2008), Liu and Shi (2013)). But standard GARCH-type
models cannot account for all of the extreme price events within the data (Swider and Weber
(2007)). Hence, many researchers developed extended models which can account for severe price
movements. These models commonly fall into two main categories. First, there are regime-switching
models, which introduce different regimes, usually a base and a spike regime, with
different probabilities for a price spike to occur (see, for instance, Karakatsani and Bunn (2008),
Janczura and Weron (2012), Eichler and Tuerk (2013)). Second, there are diffusion models,
which add a jump component, e.g. a Poisson process, to allow for price spikes (see, for instance,
Weron (2008), Escribano et al. (2011)). Rarely are there approaches which focus solely on the
price spike itself and try to forecast the event without modeling the whole price time series, e.g.
in Christensen et al. (2012).


However, all of these approaches for modeling price spikes have in common that they
focus mainly on the price time series and not on the underlying mechanism which determines
the price process. The electricity price can also be seen as the intersection between the parts of the
electricity supply and demand which were traded at an exchange. The resulting sale and purchase
curves, which are also referred to as ask and bid curves or market supply and market demand
curves, contain all the information needed to determine the market price and provide
further information on the prices for other market volumes. This information
can be necessary especially for estimating the likelihood of extreme price events, as the
elasticity of the price, which can be obtained from the shape of the sale and purchase curves,
largely accounts for price movements.

But even though a time-series approach for modeling and especially forecasting auction data
is relatively new and has not been applied to electricity price data in a comprehensive manner,
modeling the structure of the supply and demand curves in general has been done by some
authors, even if very few of them utilize real auction data. Most of these models belong to
the field of fundamental models, but are also often referred to as structural models, as they try to
capture the structure of the market. Many of them originate from the field of derivative pricing
and do not focus on forecasting the electricity price itself and therefore avoid the uncertainties
which come along with it. Barlow (2002) is one of the first authors in electricity price research
who formulates a model motivated by real auction data of an electricity market. In his paper he
uses a non-linear Ornstein-Uhlenbeck process to obtain a realistic image of the true underlying
price process and is also able to capture extreme price events. In chapter 7 of their book, Eydeland and
Wolyniec (2003) introduce a basic market model approach which maps the energy supply to the
price of electricity. They make use of the structure of the market by constructing
the so-called bid stack, which refers to the marketed aggregated supply of energy for different
prices and should, in theory, be equivalent to the sale curve at the investigated auction market (see footnote 1).


1 We want to point out that the bid stack is not necessarily the same as the energy supply, as the bid stack
also includes e.g. bidding behavior. For more information on this we suggest reading Eydeland and Wolyniec
(2003).
Given the specific cost functions of energy generators they are able to determine the bid stack
function and afterwards the system price of electricity. Another promising approach arose in
the working paper of Buzoianu et al. (2005), who model the marketed supply and demand
curves. They assume a linear demand function and a nonlinear supply function to construct a
price-quantity model, where the intersection of both curves equals the market clearing price. To
approximate the market curves they use external factors like temperature, gas energy supply
and gas price. Boogert and Dupont (2008) use a market structure approach which includes
the relationship of electricity demand to available capacity to forecast electricity prices and the
probability of spikes for the Dutch electricity market. Another structural approach can be found
in Howison and Coulon (2009) and Carmona et al. (2013), who perform an analysis of the sale
and purchase structure and integrate some of its aspects by incorporating the bid stack model.
Extensions to basic structural models are often done via the introduction of market-specific
determinants, as for instance the solar and wind power feed-in as done by Wagner et al. (2014)
or CO2 emissions as done by Hendricks and Ehrhardt (2013).

Some of the recent approaches try to capitalize on the increasing amount of available data,
especially the hourly auction data of the EPEX, which allows for a deep analysis of the real
offered volumes for selling and purchasing electricity. As this usually results in a large amount
of data and therefore complexity, some researchers tried to simplify the resulting market curves
by merging them into a new curve with desirable properties. For instance, Eichler et al. (2012)
illustrate in an extended abstract an idea for modeling the German/Austrian EPEX price using
the supply/demand curves. They utilize the curves to model a scaled supply and demand
spread using an autoregressive time series model with weekday effects. Coulon et al. (2014)
try to overcome the common issue of the assumption of inelastic demand by constructing a
“price curve” out of the marketed supply and demand curve for the same hour. The resulting
curve exhibits many well-known typical behavioral attributes, e.g. weekday effects. The price
curve is then matched with a pseudo-demand curve, which is again a vertical line, where the
intersection of both results in the market clearing price. A related approach is used by Aneiros
et al. (2013) for the Spanish electricity market. They consider a functional modelling approach
for a similar price curve as defined in Coulon et al. (2014), but call it “residual demand curve”.
However, in electricity price research the term residual demand curve is usually more common
in the framework of market and bidding behavior (as in Hortacsu and Puller (2008), Vazquez
et al. (2014) or Portela et al. (2016)). Hildmann et al. (2015) empirically analyze the impact
renewables would have on the real auction data of the EPEX if they were not subsidized by the government.
For instance, by manipulating the marketed supply curve accordingly they show that negative
prices vanish completely when the wind power feed-in is marketed at its true marginal costs.
A more detailed survey on structural models can be found in Carmona and Coulon (2014).

All of these papers have in common that they exhibit at least one of the following major
drawbacks. They do not incorporate real auction data (e.g. Boogert and Dupont (2008)), they
assume that the demand is inelastic and therefore focus only on the bid stack (e.g. Eydeland
and Wolyniec (2003), Howison and Coulon (2009), Carmona et al. (2013); see footnote 2), they use
simplifications or modulations which skip the important correlation structure between bids (e.g. Buzoianu
et al. (2005), Coulon et al. (2014)) or they are not properly adjusted for forecasting real
electricity prices (e.g. Barlow (2002)). Outside electricity price research, an econometric time-series
approach which actually covers the contemporaneous nature of functionally related and
time-dependent auction data can be found in Bowsher (2004), who applies a functional signal plus
noise time series model to a security of the FTSE100.

Our idea aims to fill the gap between research done in time-series analysis, where the structure 
of the market is usually left out and the research done in structural analysis, where empirical 


2 The assumption of an inelastic demand can be justified for some markets. In the case of the electricity market 
for Germany and Austria on the contrary, where a large proportion of trading is done between different energy 
companies on a national and international basis, the assumption of inelastic demand is not realistic. 
data is utilized very rarely and even less thoroughly. It is especially new in the sense that it gets
the best of both worlds: it will provide deep insight into the bidding behavior of market participants,
while still retaining a high accuracy in probabilistic forecasting of the market price. We will
therefore use the true data generating process, i.e. the sale and purchase curves of the electricity
price, to provide better probabilistic forecasts for extreme price movements while still modeling
the time series of electricity prices by an autoregressive approach. We will use the hourly
day-ahead electricity price auction data of Germany and Austria provided by the EPEX Spot, also
known as Phelix. It will be shown that incorporating the sale and purchase data yields promising
results for forecasting the likelihood of extreme price events. Within our approach we will be
able to estimate the full prediction density of electricity prices.

Our paper is organized as follows. The next section focuses on our idea and will describe 
the data and our observations for the EPEX Spot day-ahead auctions. We will follow up with a 
detailed description of our model and its specific setup for the auction data. Afterwards we show 
the empirical results of our approach. Our last section discusses our findings and will provide 
insights for possible improvements and future research. Throughout the paper we will use the phrase
“price curves” for both the sale and the purchase curve. Every price will be provided in EUR/MWh
and every volume in MW, if not specified otherwise. Note that the market clearing volume is
reported by the EPEX as energy in MWh. As we will only consider hourly data, we denote the
volume in MW.


2. Price formation process and price curves structure 


The electricity price of exchanges is the result of competitive bidding and offering. Focusing 
merely on the time series of prices therefore neglects their true source. If the true sale and 
purchase curves were known, the price could be solely determined by the intersection of both 
curves - regardless of any time dependencies between different prices. Many authors point out
that the price is driven by external factors, e.g. wind and solar or electricity demand, see for
instance Weron (2014). However, taking a closer look at the underlying price process, it can
be stated that it is the buyers and sellers on an electricity exchange who are influenced by
those factors and therefore adjust their bids. One reason can be, for example, that these market
participants are electricity companies facing heavy overproduction of electricity due to
an unexpected change in wind speed or temperature, or underproduction due to outages of
power plants.

But those market participants are not equal; they can be investment companies, electricity
producers or transmission service operators, among others. Also, not all electricity producers are
equal - they have distinct production portfolios and are therefore more or less prone to
e.g. extreme weather conditions. An unexpected shift in wind production levels, for instance, can
therefore lead to a small or a vast change in prices, depending on whether the equilibrium price of the
market was already mainly driven by wind producers. This diversified information is summarized
in the sale and purchase curves of electricity prices. Hence, especially for estimating heavy price
movements it is essential to know whether the market is capable of adjusting to external shocks easily
or whether a tremendous price spike will occur. This sensitivity of the intersection price can
be obtained by analyzing the original price curves instead of only their outcome as a price time
series. To motivate our idea even further, we decided on showcasing the day-ahead price of the
12.04.2015 of the EPEX Spot for Germany and Austria. We will use this day throughout our
whole paper, as it provides easily traceable insights into the typical price movement process when
an extreme price spike occurs.

Figure 1: Market clearing price time series with supply and demand curves for 12:00 and 13:00,
and 19:00 and 20:00 hours resp. on 12.04.2015 auctions.

The left-hand side of Figure 1 shows the day-ahead electricity price of the 12.04.2015. On
the upper and lower right-hand side the price curves for 12:00 and 13:00 and for 19:00 and 20:00,
respectively, are provided. The horizontal axis of these panels represents the trading volume and the
vertical axis the price. It is shown that during the afternoon hours the electricity price declined
heavily, even reaching negative values. Examining the price curves for 12:00 that day on the
upper right-hand side, two typical phenomena for such an observation can be seen. First, the
traded volume is, in comparison to other hours that day, relatively high. Second, the slope of 
the demand curve for any price and volume combination with a lower price than the market 
price is extremely negative. Simultaneously, the slope of the supply curve is extremely positive 
for price and volume combinations with a lower price than the market price - at least for price 
combinations close to the actual market price. The left-hand side of the figure shows
that the price exhibited a tremendous decline from 12:00 to 13:00. Taking into account
the phenomena mentioned beforehand gives insights into why such a heavy price spike was even
possible. The high amount of supplied electricity shifted the price to a level where usually only
a relatively small proportion of bids can be found, that is, the supply and demand curves exhibited
high “steps” (i.e. in the horizontal direction). Those “steps” result in the second observation of curves having
extremely negative or positive slopes close to the market price. This in turn indicates that the
equilibrium price is very sensitive to external shocks. Any sudden decrease in demand, which
would lead to a left-shift of the demand curve, or any sudden increase in production, which would
lead to a right-shift of the supply curve, has a great impact on the price - especially in comparison
to other, higher price levels. But we can also see that the supply curve for 13:00 exhibits a slope
of almost zero around the intersection price, indicating that any further decrease in demand
or increase in supply will not have the same vast effects as before. And indeed, the price
movement from 13:00 to 14:00 was much smaller than the one from 12:00 to 13:00. In contrast,
the price curves for 19:00 and 20:00 on the lower right-hand side of the figure show the typical
behavior of price curves when the market is not very prone to extreme events. To the right of the
intersection price, the price curves do not seem to have extremely positive or negative
slopes, respectively. Only the demand curve left of the intersection price seems to have a very
negative slope, but matching it with the supply curve it can be seen that any shifts to the right
or left will be captured by the supply curve easily and can therefore not result in a heavy price
spike.

Under the assumption that not only the price but also the price curves are dependent on time,
we will derive a model which is tailor-made for the day-ahead auction market of the EPEX Spot
in Germany and Austria for the purpose of estimating the likelihood of heavy price movements.
Therefore, and in order to introduce our model, we need to take a closer look at the EPEX Spot
market and the observed bidding structure of its participants.

For our summary statistics and all computation results up to section 4 we use the data from
01.10.2012 to 19.04.2015. All techniques can, however, be applied to other electricity markets in
exactly the same way, provided that their specific market features are taken into account.

The day-ahead electricity spot price of the EPEX is determined in daily auctions at 12:00
CET for the hours of the next day. So there are in general 24 prices every day. Due to
daylight saving time we have, once a year each, 23 values on one day in March and 25 values on one day in October. For the
24 auctions on a common day we use the labels 0:00, 1:00, ..., 23:00 within this paper.
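
The number of hourly products per delivery day can be checked with a short sketch. It assumes that delivery hours follow local time in the `Europe/Berlin` zone (our assumption, not taken from the EPEX rules); the function name `hours_in_delivery_day` is purely illustrative.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

# Assumption: delivery hours follow local time in this zone.
MARKET_TZ = ZoneInfo("Europe/Berlin")


def hours_in_delivery_day(day: date) -> int:
    """Count the hourly products between local midnight and the next local midnight."""
    start = datetime.combine(day, time(0), tzinfo=MARKET_TZ)
    end = datetime.combine(day + timedelta(days=1), time(0), tzinfo=MARKET_TZ)
    # Convert to UTC first: subtracting two datetimes that share the same
    # tzinfo compares wall-clock times and would always yield 24 hours.
    elapsed = end.astimezone(timezone.utc) - start.astimezone(timezone.utc)
    return int(elapsed.total_seconds() // 3600)


if __name__ == "__main__":
    for day in (date(2015, 3, 29), date(2015, 4, 12), date(2015, 10, 25)):
        print(day, hours_in_delivery_day(day))
```

The three example dates are the two clock-change days of 2015 and the showcase day 12.04.2015.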

Since 2008 the electricity spot price has been bounded between $P_{\min} = -500$ and $P_{\max} = 3000$.
Before that, negative prices were not allowed. The traders at the EPEX can make bids
for either selling or buying a certain amount of electric power. By the EPEX regulations, the
minimal order size for Germany and Austria is 0.1 MW for a one-hour block and the minimal
price difference between different orders is 0.1 EUR/MWh. Hence, there are in total 35001
different possible prices on the full price grid $\mathbb{P} = \{-500, -499.9, \ldots, 2999.9, 3000\}$. In
practice, however, not every one of those possible prices is utilized. Also, a single trader can only submit up
to 256 distinct price and volume combinations as offers. Usually there are about 700 different
prices which construct each curve, as depicted in Figure 2. The illustration
shows the histogram of the number of different prices for both the demand and the supply
curve of electricity at the EPEX Spot. It can be seen that the number of different prices ranges from
approximately 200 to 1000, depending on which side of the market is considered. In general we
have slightly more different prices for the supply side. In total there were about 31000 bids on
distinct prices within the considered period.
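
As a quick check of the grid size, the following sketch builds the price grid from integer tick counts. The names `price_grid` and `price_to_index` are illustrative helpers, not part of any EPEX interface.

```python
P_MIN, P_MAX = -500.0, 3000.0  # price bounds in EUR/MWh
TICKS_PER_EUR = 10             # minimal price difference of 0.1 EUR/MWh

# Work with integer tick counts so that floating-point error cannot accumulate.
n_ticks = round((P_MAX - P_MIN) * TICKS_PER_EUR)
price_grid = [round(P_MIN + k / TICKS_PER_EUR, 1) for k in range(n_ticks + 1)]


def price_to_index(price: float) -> int:
    """Position of a price on the grid (0 for P_MIN)."""
    return round((price - P_MIN) * TICKS_PER_EUR)


if __name__ == "__main__":
    print(len(price_grid))                 # number of admissible prices
    print(price_grid[0], price_grid[-1])   # grid end points
    print(price_to_index(0.0))             # index of the price 0
```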

Figure 2: Amount of different bid prices for the auctions. (Histogram; horizontal axis: different bid prices per auction.)

Moreover, other order types are allowed, e.g. standard block orders, linked block orders
and exclusive block orders. The underlying market coupling algorithm to derive the market
clearing price from all submitted bids is very complex. Since February 2014 the EUPHEMIA algorithm
(acronym of Pan-European Hybrid Electricity Market Integration Algorithm) has been used to compute
the market clearing price and volume by maximizing the total welfare. This involves a market
coupling of several European markets like the EPEX, APX, Nordpool and OMIE. However, the
EPEX provides only the sale and purchase curves as illustrated in Figure 1, not every single
bid. Solely this dataset is used for our empirical supply and demand analysis. Thus we assume
indirectly that, given a certain hour, all underlying bids are standard bids (single 1-hour block).
This implies that we neglect the specific impact of potential complex bids like block orders.

As mentioned before, most of the prices in the price grid $\mathbb{P}$ were not bid at all. Nevertheless,
given the price grid $\mathbb{P}$ we can explore the bid supply and demand volumes $V_{S,t}(P)$ and $V_{D,t}(P)$ at
price $P \in \mathbb{P}$ and time $t$. Later on we will see that especially the prices with an actual bid at time
$t$ play an important role in the price coupling algorithm for the shape of the sale and purchase
curves. Therefore we introduce with $\mathbb{P}_{S,t}$ and $\mathbb{P}_{D,t}$ the bid prices on the supply and demand side
at time point $t$. They are defined to be all prices with a positive bid volume:

$$\mathbb{P}_{S,t} = \{P \in \mathbb{P} \mid V_{S,t}(P) > 0\} \quad\text{and}\quad \mathbb{P}_{D,t} = \{P \in \mathbb{P} \mid V_{D,t}(P) > 0\}. \tag{1}$$

When the bids $V_{S,t}$ and $V_{D,t}$ are aggregated, the well-known price curves for a certain hour
can be constructed, which map a certain amount of supply or demand to a certain price. The
aggregated supply and demand volumes at the bid prices $P \in \mathbb{P}_{S,t}$ and $P \in \mathbb{P}_{D,t}$ match
exactly the corresponding points on the sale and purchase curves illustrated in Figure 1.
Mathematically precise, the sale and purchase curves are characterized by

$$S_t(P) = \sum_{\substack{p \in \mathbb{P}_{S,t} \\ p \le P}} V_{S,t}(p) \ \text{ for } P \in \mathbb{P}_{S,t} \quad\text{and}\quad D_t(P) = \sum_{\substack{p \in \mathbb{P}_{D,t} \\ p \ge P}} V_{D,t}(p) \ \text{ for } P \in \mathbb{P}_{D,t}. \tag{2}$$

However, equation (2) defines the curves explicitly only on the price grids $\mathbb{P}_{S,t}$ and $\mathbb{P}_{D,t}$.
As mentioned, according to the operational rules of the EPEX the market clearing price is
determined by the EUPHEMIA algorithm, which involves complex orders as well. Nevertheless, it
is assumed by the EPEX that the relation between two different bid price and quantity combinations of
one market participant is linear. Therefore, and to simplify the algorithm used, we will use linear
interpolation between two different price and quantity combinations given by $\{(S_t(P), P) \mid P \in \mathbb{P}_{S,t}\}$
and $\{(D_t(P), P) \mid P \in \mathbb{P}_{D,t}\}$ for the supply and demand. The market clearing price will
be calculated by the resulting intersection of both price curves, rounded to two decimal places.
Consequently there is sometimes a small but rather negligible difference to the true market price.
In particular, in 64% of all cases this price matches the market coupling price, in 89% of all cases
the difference is less than 0.1 EUR/MWh, which is the smallest bidding unit, and in 99.8% this
difference is less than 1 EUR/MWh. This fact is managed by certain rules for the traders, so
that the amount of volume of the market clearing price must be delivered.
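
The construction in equation (2) and the interpolated intersection can be sketched in a few lines. This is a minimal illustration with invented toy bids, not the EUPHEMIA algorithm; the function names and the dictionary format (price mapped to bid volume) are our assumptions, and the cumulation is taken to include the bid at the price itself.

```python
from bisect import bisect_right
from itertools import accumulate


def supply_curve(bids):
    """Points (S_t(P), P): cumulated volume of all supply bids with price <= P."""
    prices = sorted(bids)
    return list(zip(accumulate(bids[p] for p in prices), prices))


def demand_curve(bids):
    """Points (D_t(P), P): cumulated volume of all demand bids with price >= P."""
    prices = sorted(bids, reverse=True)
    return list(zip(accumulate(bids[p] for p in prices), prices))


def price_at(curve, volume):
    """Linearly interpolated price at a volume; curve is sorted by volume."""
    volumes = [v for v, _ in curve]
    i = min(max(bisect_right(volumes, volume), 1), len(curve) - 1)
    (v0, p0), (v1, p1) = curve[i - 1], curve[i]
    return p0 + (p1 - p0) * (volume - v0) / (v1 - v0)


def clearing_point(supply, demand):
    """Intersection (volume, price) of the two piecewise linear curves."""
    lo = max(supply[0][0], demand[0][0])
    hi = min(supply[-1][0], demand[-1][0])
    knots = sorted({v for v, _ in supply + demand if lo <= v <= hi} | {lo, hi})

    def gap(v):
        return price_at(supply, v) - price_at(demand, v)

    # Between two neighbouring knots both curves are linear, so the gap is too.
    for a, b in zip(knots, knots[1:]):
        ga, gb = gap(a), gap(b)
        if ga <= 0 <= gb:
            v = a if gb == ga else a + (b - a) * (-ga) / (gb - ga)
            return v, round(price_at(supply, v), 2)
    raise ValueError("curves do not intersect on the common volume range")


if __name__ == "__main__":
    # Toy bids (price in EUR/MWh -> volume in MW), invented for illustration.
    supply = supply_curve({0.0: 100, 20.0: 100, 40.0: 100})
    demand = demand_curve({60.0: 100, 30.0: 100, 10.0: 100})
    print(clearing_point(supply, demand))
```

Evaluating the price gap only at the merged breakpoints keeps the search exact, because the root on each segment follows from a single linear equation.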

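To understand the characteristics of the bid volumes $V_{S,t}(P)$ and $V_{D,t}(P)$ better we ignore
the time dependency in a first step. We evaluate the mean bid volumes

$$\overline{V}_S(P) = \frac{1}{T} \sum_{t=1}^{T} V_{S,t}(P) \quad\text{and}\quad \overline{V}_D(P) = \frac{1}{T} \sum_{t=1}^{T} V_{D,t}(P) \tag{3}$$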

for the supply and demand side given a price $P \in \mathbb{P}$ and the number of observations $T$ across all hours
in the database. Figure 3 shows the average bid volumes $\overline{V}_S(P)$
and $\overline{V}_D(P)$ with $P \in \mathbb{P}$ within the price range of -20 to 100. Additionally we highlighted the
realized amount. We can observe that for the supply side almost all bids at low prices were realized,
whereas for high prices only a few were. This relation is reversed for the demand side. We also
observe some patterns in the bidding behavior of some traders. For example, we have quite large
volumes at a price of 0, but very small amounts at e.g. 0.3 or -0.3. Furthermore we have spikes
at multiples of 5, so e.g. 70.0 has a larger value than 69.0 or 71.0. This shows that agents seem
to prefer to bid on round numbers. This may indicate that at
least a noticeable amount of trading is done by human decision and not based on algorithmic
trading rules.

Figure 3: Average bid volumes $\overline{V}_S(P)$ and $\overline{V}_D(P)$ for $P \in \mathbb{P}$ in the price range -20 to 100.
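
The averaging in equation (3) and a simple measure of the round-number effect could be computed as in the following sketch. The auction data are invented toy values, prices without a bid in an auction are treated as zero volume, and `round_number_share` is a hypothetical summary statistic that does not appear in the paper.

```python
from collections import defaultdict


def mean_bid_volumes(auctions):
    """Average bid volume per price over T auctions (no bid counts as volume 0)."""
    totals = defaultdict(float)
    for bids in auctions:
        for price, volume in bids.items():
            totals[price] += volume
    n_auctions = len(auctions)
    return {price: total / n_auctions for price, total in sorted(totals.items())}


def round_number_share(mean_volumes, multiple=5):
    """Share of the mean volume that sits on prices divisible by `multiple`."""
    # Compare in integer ticks of 0.1 EUR/MWh to avoid floating-point remainders.
    on_multiple = sum(
        volume
        for price, volume in mean_volumes.items()
        if round(price * 10) % (multiple * 10) == 0
    )
    return on_multiple / sum(mean_volumes.values())


if __name__ == "__main__":
    # Two toy auctions (price in EUR/MWh -> volume in MW), invented for illustration.
    auctions = [
        {0.0: 300, 0.3: 10, 70.0: 50},
        {0.0: 100, 69.9: 20, 70.0: 30},
    ]
    means = mean_bid_volumes(auctions)
    print(means)
    print(round_number_share(means))
```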

The large bid volumes at a certain price lead to a price cluster around this price. The most
intense price cluster can be found at a value of zero, as can be seen in Figure 3. Even
in the small time frame shown in Figure 1 there are four realized electricity price values very
close to zero. For the auction at 12:00 we see that the high probability of a price of zero is mainly
driven by the supply side. Figure 1 also shows that for the auction at 12:00 there is another
price cluster at -65, which is again driven by the supply side. And again, for the realized price
we can observe two values very close to -65, e.g. the realized price for 13:00 is exactly -65.02.

Figure 4: Histogram of realized prices (market clearing prices) in the price range -20 to 100.


In Figure 4 we see the histogram of all market clearing prices. To visualize the price cluster
effect it is important to choose tiny histogram bins. Here every tiny rectangle represents about
0.33% of the probability mass. We observe a very spiky histogram that exhibits some properties
of the bid prices in Figure 3. For example, we clearly see a price cluster at zero. Here the relative
frequency of a market clearing price between -0.5 and 0.5 is, at 0.634%, relatively large.
In contrast, the relative frequency of a price in the neighboring intervals of the same size
from -1.5 to -0.5 and from 0.5 to 1.5 is only 0.079% and 0.056%, so about 10 times smaller. Other
price clusters can again be found at all full integers between 10 and 60, where those clusters that
are divisible by 5 are more distinct. In general the density of the electricity price is complicated
due to its multi-modal shape. The modes are located at the mentioned price clusters. As far as we know
there is no model in electricity price modeling that at least tries to capture this behavior. In
contrast to that, our modeling approach will incorporate this effect and thus try to capture the
true market behavior more realistically.
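
The interval frequencies quoted above can be reproduced for any price series with a helper like the one below. It is a sketch: the half-open interval convention is our assumption, since the text does not state how interval end points are treated. The second part only redoes the arithmetic behind the "about 10 times smaller" statement with the percentages from the text.

```python
def relative_frequency(prices, lower, upper):
    """Share of prices p with lower <= p < upper (half-open by assumption)."""
    hits = sum(1 for p in prices if lower <= p < upper)
    return hits / len(prices)


# Percentages quoted in the text for the cluster at zero and its two neighbours.
CENTER, LEFT, RIGHT = 0.634, 0.079, 0.056
ratio_left = CENTER / LEFT     # roughly 8
ratio_right = CENTER / RIGHT   # roughly 11

if __name__ == "__main__":
    # Toy price series in EUR/MWh, invented for illustration.
    prices = [0.0, 0.01, -0.02, 0.0, 1.0, -1.0, 25.0, 30.0, 30.0, 35.07]
    print(relative_frequency(prices, -0.5, 0.5))
    print(relative_frequency(prices, -1.5, -0.5))
    print(relative_frequency(prices, 0.5, 1.5))
    print(round(ratio_left, 1), round(ratio_right, 1))
```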


3. Model for the supply and demand curve 


Modeling the supply and demand curves of electricity prices is a very complex task. Researchers
who try to analyze the complex bidding structure of the supply and demand at electricity
exchanges usually utilize multi-agent models or fundamental models (Weron (2014)). But those
approaches rarely take into account the real time series of auction data and are therefore
unsuitable for giving practical information on short-term forecasts of the electricity price time
series. This is especially interesting when taking a closer look at the price curves over time.
In Figure 5 we show the time series of both price curves from 13.04.2015 to 19.04.2015 in a
three-dimensional plot. To put emphasis on the price scale, which is presented on the y-axis, we
added a colored legend for it, which can be found in the lower two pictures of the figure. The
upper two pictures show the price curves on the full price grid, whereas the two pictures in
the middle focus on the price range close to the market clearing price. Judging only by the figure,
it can be seen that both the supply and the demand curve exhibit a seasonal pattern over
time with at least daily dependence.

However, to our knowledge there is not a single paper for the electricity market which actually
models real price curves and uses them directly to forecast real electricity price time series by an
econometric approach. Therefore the following sections will describe the necessary setup for such
a model in detail and give references whenever our model idea makes use of a well-known
econometric technique.

For our model we proceed in three steps: 


1. To overcome the massive amount of data we will organize the bid volumes in price classes.
This will be discussed in Section 3.1.

2. We provide a stochastic model to forecast the bid volume of each price class. Section 3.2
will cover this step.

3. Given the forecasted bids within each price class we reassemble the precise bidding structure
by reconstructing the classes. Then we calculate the supply and demand curves to
compute the market clearing price by the intersection of both curves. This will be explained
in Section 3.3.


We will refer to our model for the sale and purchase curves of the electricity price as X-Model 
throughout the paper. We choose the letter X, as it symbolizes visually the intersection of the 
supply and demand curve. 


3.1. Price classes for bids 

As mentioned, there are 35001 possible volumes on the full price grid $\mathbb{P}$. Theoretically we
could model each of these processes, but this is almost infeasible due to the computational burden.
Therefore we show how to choose and apply a simple dimension reduction procedure to the
price formation process that is computationally manageable and still limits the related loss of
information. To this end, we merge the 35001 prices in $\mathbb{P}$ into a smaller number of classes. For
the bids within a price class we will assume later on that they behave similarly over time.

For creating the price classes we consider the mean bid volumes $\bar V_S(P)$ and $\bar V_D(P)$ at price
$P$ as defined in equation (3). We use them as a measure of the importance of the price $P$
for the supply and demand side of trading. Similarly to the definition of the price curves in
equation (2), we define the mean supply and demand curves $\bar S$ and $\bar D$. They are characterized
by accumulating the mean bid volumes from equation (3):

$$\bar S(P) = \sum_{p \in \mathbb{P}_S,\; p \le P} \bar V_S(p) \ \text{ for } P \in \mathbb{P}_S \quad \text{and} \quad \bar D(P) = \sum_{p \in \mathbb{P}_D,\; p \ge P} \bar V_D(p) \ \text{ for } P \in \mathbb{P}_D, \qquad (4)$$

where $\mathbb{P}_S = \bigcup_{t=1}^{T} \mathbb{P}_{S,t}$ and $\mathbb{P}_D = \bigcup_{t=1}^{T} \mathbb{P}_{D,t}$ are the sets of all bid prices for the supply and demand
side. As in (2), the complete mean supply and demand curves are given by the linear interpolation
of the points characterized by (4). The resulting mean supply and demand curves are shown in
Figure 6. Note that the corresponding mean supply and demand functions $\bar S$ and $\bar D$ on the price
grid $\mathbb{P}$ are monotonic: $\bar S$ is increasing and $\bar D$ is decreasing in the price. Therefore we can use
the inverse functions $\bar S^{-1}$ and $\bar D^{-1}$ for the creation of the price classes.

Figure 5: 3d price curves from 13. April 2015 to 19. April 2015 in a price color plot with color
legends. The colored circles represent the market clearing price and volume, the color matches
the price in the color legends. Panels: (a) Supply from -500 to 3000 in EUR/MWh, (b) Demand
from -500 to 3000 in EUR/MWh, (c) Supply from -20 to 100 EUR/MWh, (d) Demand from -20 to
100 EUR/MWh, (e) Legend from -500 to 3000 in EUR/MWh.

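For creating the price classes we additionally require an amount of volume $V_*$, which gives
the average amount of volume that should be represented by every price class. Then we define
an equidistant volume grid $\mathbb{V}_* = \{ i V_* \mid i \in \mathbb{N} \}$. Using $\mathbb{V}_*$ and the inverse mean supply and demand
curves $\bar S^{-1}$ and $\bar D^{-1}$ we define the upper and lower values of the price classes by

$$\mathcal{C}_S = \bar S^{-1}(\mathbb{V}_*) = \{ \bar S^{-1}(i V_*) \mid i \in \mathbb{N} \} \quad \text{and} \quad \mathcal{C}_D = \bar D^{-1}(\mathbb{V}_*) = \{ \bar D^{-1}(i V_*) \mid i \in \mathbb{N} \}. \qquad (5)$$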

Figure 6 visualizes the classifying procedure for a volume of $V_* = 1000$. For our modeling
approach later on we decided to stick with a volume size of 1000 for classifying, as it gives
us a manageable number of classes to estimate. However, other amounts of volume are definitely
plausible. Given our data we obtain a total of $M_S = 16$ and $M_D = 16$ classes for the supply
side and the demand side. The price class bounds $\mathcal{C}_S$ and $\mathcal{C}_D$, which represent the
price classes, are given in Table 1. Note that for supply price classes $c \in \mathcal{C}_S$ the price class is
always represented by the upper bound $c$, whereas for demand price classes $c \in \mathcal{C}_D$ the price
class is always represented by the lower bound $c$.

Figure 6: Mean supply and demand curves $\bar S$ and $\bar D$ on two selected price ranges with volume
grid $\mathbb{V}_*$, price class bounds $\mathcal{C}_S$ and $\mathcal{C}_D$ for $V_* = 1000$. Panels: (a) Mean supply on range
-500 EUR/MWh to 3000 EUR/MWh, (b) Mean demand on range -500 EUR/MWh to 3000 EUR/MWh,
(c) Mean supply on range -20 EUR/MWh to 100 EUR/MWh, (d) Mean demand on range -20 EUR/MWh
to 100 EUR/MWh. The volume axis is given in MW.

                    price class bounds (EUR/MWh)

$\mathcal{C}_S$   -500, -103.9, -55.1, 1.3, 19.5, 27.5, 31.3, 36.2, 42.4, 49.2, 58.0, 72.2, 225.0, 950.0, 2883.0, 3000
$\mathcal{C}_D$   3000, 499.9, 157.4, 52.6, 37.3, 30.9, 28.0, 24.5, 17.8, 13.8, 11.2, 8.4, 0.0, -10.7, -200.0, -500

Table 1: Price class bounds in $\mathcal{C}_S$ for the supply and in $\mathcal{C}_D$ for the demand

The price classes $\mathbb{P}_S(c)$ for a price class upper bound $c \in \mathcal{C}_S$ and $\mathbb{P}_D(c)$ for a price class
lower bound $c \in \mathcal{C}_D$ are given by

$$\mathbb{P}_S(c) = \{ P \in \mathbb{P} \mid P > \max\{ p \in \mathcal{C}_S \mid p < c \},\; P \le \min\{ p \in \mathcal{C}_S \mid p \ge c \} \},$$

$$\mathbb{P}_D(c) = \{ P \in \mathbb{P} \mid P \ge \max\{ p \in \mathcal{C}_D \mid p \le c \},\; P < \min\{ p \in \mathcal{C}_D \mid p > c \} \}.$$

Here $\mathbb{P}_S(c)$ and $\mathbb{P}_D(c)$ are all prices that belong to the same price class as $c$. This means for
instance that $\mathbb{P}_S(-500) = \{-500\}$ and $\mathbb{P}_S(-103.9) = \{-499.9, -499.8, \ldots, -103.9\}$. As $c \in \mathcal{C}_S$
and $c \in \mathcal{C}_D$ uniquely describe the price classes $\mathbb{P}_S(c)$ and $\mathbb{P}_D(c)$, we can take $c$ as the price class
representative and refer to $\mathcal{C}_S$ and $\mathcal{C}_D$ as price classes even though they are only collections of
price class bounds.
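
The class construction in (5) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the helper names and the toy bid data are hypothetical, and the inverse of the mean supply curve is taken as a step function (the smallest bid price whose accumulated mean volume reaches a grid level) rather than through the linear interpolation used in the text.

```python
from bisect import bisect_left
from itertools import accumulate


def supply_class_bounds(prices, mean_volumes, v_star):
    """Upper bounds of the supply price classes for class volume v_star.

    prices must be sorted increasingly; mean_volumes[i] is the mean bid
    volume at prices[i]. Step-function inverse: an assumption of this sketch.
    """
    curve = list(accumulate(mean_volumes))  # mean supply curve at each price
    bounds = []
    i = 1
    while i * v_star <= curve[-1]:
        # smallest price whose accumulated volume reaches the grid level i * v_star
        price = prices[bisect_left(curve, i * v_star)]
        if not bounds or price > bounds[-1]:
            bounds.append(price)
        i += 1
    if not bounds or bounds[-1] != prices[-1]:
        bounds.append(prices[-1])  # the last class ends at the highest bid price
    return bounds


def class_of(price, bounds):
    """Representative (upper bound) of the supply class that contains price."""
    return bounds[bisect_left(bounds, price)]


def class_volumes(prices, volumes, bounds):
    """Sum the bid volumes of one auction within each supply price class."""
    totals = {c: 0 for c in bounds}
    for price, volume in zip(prices, volumes):
        totals[class_of(price, bounds)] += volume
    return totals


# Hypothetical toy data, not taken from the EPEX sample.
toy_prices = [-500, -100, 0, 10, 20, 30, 3000]
toy_mean_volumes = [2500, 400, 700, 900, 600, 300, 100]
toy_bounds = supply_class_bounds(toy_prices, toy_mean_volumes, 1000)
toy_totals = class_volumes(toy_prices, toy_mean_volumes, toy_bounds)
print(toy_bounds)
print(toy_totals)
```

In this toy setting the lowest price forms a class of its own and the last class covers less than the target volume, which mirrors the behavior described for the real data. The demand side would work analogously, with the accumulation running from the highest price downwards and each class represented by its lower bound.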

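Moreover, the volumes at time $t$ associated with the price classes $\mathcal{C}_S$ and $\mathcal{C}_D$ are given by

$$X^{(c)}_{S,t} = \sum_{P \in \mathbb{P}_S(c)} V_{S,t}(P) \quad \text{for } c \in \mathcal{C}_S,$$

$$X^{(c)}_{D,t} = \sum_{P \in \mathbb{P}_D(c)} V_{D,t}(P) \quad \text{for } c \in \mathcal{C}_D.$$

Hence $X^{(-500)}_{S,t}$ gives the amount of volume bid on the supply side at exactly -500 at time $t$ and
$X^{(-103.9)}_{S,t}$ the amount between -499.9 and -103.9 inclusive at time $t$.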

As an example we show several bid volume processes of selected price classes in Figure 7
over a short time span. Note that the illustrated bid volumes $X^{(-500)}_{S,t}$ and $X^{(3000)}_{D,t}$ are very
important in practice, as they represent large volumes. Moreover, both bids are favored by some
market participants as they will always be realized. The corresponding bid volumes cover
the price-inelastic supply or demand and are also known as the must-run stack. However, these
volumes cover not only the must-run bids but also the net import and export positions
and specific block orders like limit orders. We observe that they have a more distinct seasonal
structure than the common bids. Due to the way we construct our classes, $X^{(-500)}_{S,t}$ and $X^{(3000)}_{D,t}$
are independent of the choice of the volume $V_*$.

All other bid volume processes shown in Figure 7, e.g. those of the supply class with bids in
[49.3, 58] and of the demand class with bids in [37.3, 52.5], have a more complex structure.
But many of them also exhibit a daily and weekly seasonal pattern.
In the online appendix we provide time series plots as in Figure 7 for all bid volume processes
for the full time range.

As mentioned, $X^{(-500)}_{S,t}$ and $X^{(3000)}_{D,t}$ represent large volumes. The other processes cover
approximately a volume of 1000, but there is an exception as well. Because of the construction of
the price classes using the equidistant volume grid, the last classes $X^{(3000)}_{S,t}$ and $X^{(-500)}_{D,t}$ tend to
cover smaller volumes. However, as the corresponding bids are hardly ever realized, the influence is
negligible.


3.2. Time series model for bid classes 

Now we provide a model for the bid volume processes $X^{(c)}_{S,t}$ and $X^{(c)}_{D,t}$ of the price classes $\mathbb{P}_S(c)$
and $\mathbb{P}_D(c)$. To this end, we introduce $X^{(c)}_{S,d,h}$ and $X^{(c)}_{D,d,h}$ as the supply and demand bid volume of
price class $c \in \mathcal{C}_S$ resp. $c \in \mathcal{C}_D$ at day $d$ and hour $h$. To handle the well-known clock
change due to daylight saving time, we interpolate the missing hour in March from
the two adjacent hours and use the average of the doubled hour in October, so
that there are 24 observations each day. Thus, the volume processes $X^{(c)}_{S,d,h}$ and $X^{(c)}_{D,d,h}$ are
well defined.

As mentioned above, the processes $X^{(-500)}_{S,d,h}$ and $X^{(3000)}_{D,d,h}$ play an important role. But we also
consider the impact of other possible sources that might influence the bidding behavior. In
particular we use the EPEX market clearing price and volume of Germany and Austria of
previous auctions, the planned electric power generation in Germany of conventional power
plants with more than 100 MW power, as well as the planned wind and solar power feed-in. The
last three processes are provided by the EEX transparency database. Hence, we assume that
market participants have access to this database or similar information and base their bids at
least partially on those time series. Especially the impact of wind and solar energy on electricity
prices due to the merit-order effect is well known (see e.g. Hirth (2013), Cludius et al. (2014)
or Ketterer (2014)). Therefore we introduce $M_X = 5$ additional processes denoted by $X_{\text{price},t}$,
$X_{\text{volume},t}$, $X_{\text{generation},t}$, $X_{\text{wind},t}$ and $X_{\text{solar},t}$ that represent the additional information that is available
at the time the auction takes place. A sample of the considered processes is given in
Figure 8.

Figure 7: Bid volumes of certain price classes for supply and demand for four observed weeks in
2015. Recoverable panel titles: (e) $X_S$ with bids in [49.3, 58], (f) $X_D$ with bids in [37.3, 52.5].


Similarly to $X^{(c)}_{S,d,h}$ and $X^{(c)}_{D,d,h}$ we introduce the correspondingly transformed processes $X_{\text{price},d,h}$,
$X_{\text{volume},d,h}$, $X_{\text{generation},d,h}$, $X_{\text{wind},d,h}$ and $X_{\text{solar},d,h}$ at day $d$ and hour $h$. Note that the planned
generation as well as the projected wind and solar power is known one day in advance, so we
can use e.g. $X_{\text{solar},d+1,h}$ to predict $X^{(c)}_{S,d+1,h}$ and $X^{(c)}_{D,d+1,h}$.

The considered model is a simple regression approach and similar to the basic autoregressive
model as used in Weron and Misiorek (2008), Maciejowska et al. (2016) or Ziel (2016a) for
modeling the electricity price. But we will use it in a more flexible way for the bid volume processes
of the price classes. For example, Weron and Misiorek (2008) allow for a linear dependency of
$X_{\text{price},d,h}$ on $X_{\text{price},d-1,h}$, $X_{\text{price},d-2,h}$ and $X_{\text{price},d-7,h}$ as well as dummies for Sunday, Monday and
Saturday. However, the choice of lags 1, 2 and 7 as well as the selection of the weekday dummies
is the same for all 24 hours. As in Ziel (2016a), we will allow for much more flexibility in the
model, as the structure of the data is far more complex. This applies to both the autoregressive
lag structure and the weekday structure.

Figure 8: Sample of $X_{\text{price},t}$, $X_{\text{volume},t}$, $X_{\text{generation},t}$, $X_{\text{wind},t}$ and $X_{\text{solar},t}$ from 29.03.2015 to 11.04.2015.

In Figure 9 the weekly sample mean of the bid volume processes $X^{(-500)}_{S,d,h}$ and $X^{(3000)}_{D,d,h}$ for our
full sample time range is given. There we can see that the daily seasonal structure seems to
depend on the day of the week, as is typical for the electricity market clearing price or the
electricity load (see e.g. Ziel et al. (2015a)). So we see that Saturdays and Sundays have a
clearly different behavior than the other weekdays. But from 0:00 to 6:00 the Saturday seems
to leave the typical pattern of a Sunday. Furthermore, we recognize that for the demand side
the hours from 8:00 to 19:00 are clearly on a higher level during the working days. This is
interesting, as it exactly matches the peakload standard block order at EPEX.

Figure 9: Weekly mean bid volumes for each day of the week and hour of the day. Panels:
(a) Supply $X^{(-500)}_{S}$, (b) Demand $X^{(3000)}_{D}$.


For modeling the day-of-the-week impact we define the weekday indicators

$$W_k(d) = \begin{cases} 1, & W(d) \le k \\ 0, & W(d) > k \end{cases}$$

where $W(d)$ is a function that gives a number that corresponds to the weekday of day $d$. Without
loss of generality we use $k = 1$ for a Monday, $k = 2$ for a Tuesday, up to $k = 7$ for a Sunday.
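
As a small illustration, the weekday indicators can be computed as follows. This is a sketch only: the function names are hypothetical, and the direction of the inequality follows the definition as stated above. The range $k = 2, \ldots, 7$ matches the weekday sum that appears in model (6).

```python
from datetime import date


def weekday_number(d):
    """W(d): 1 for a Monday up to 7 for a Sunday."""
    return d.isoweekday()


def weekday_indicator(k, d):
    """W_k(d) = 1 if W(d) <= k and 0 otherwise."""
    return 1 if weekday_number(d) <= k else 0


def weekday_regressors(d):
    """The six indicators W_2(d), ..., W_7(d) that enter the regression."""
    return [weekday_indicator(k, d) for k in range(2, 8)]


print(weekday_regressors(date(2015, 4, 13)))  # a Monday
print(weekday_regressors(date(2015, 4, 19)))  # a Sunday
```

Because the indicators are cumulative, each coefficient describes a change relative to the neighboring weekday rather than a separate level per day.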
To fully present the considered time series model, it is necessary to introduce the object

$$X_{d,h} = (X_{1,d,h}, \ldots, X_{M,d,h})' = \Big( \big(X^{(c)}_{S,d,h}\big)_{c \in \mathcal{C}_S}, \big(X^{(c)}_{D,d,h}\big)_{c \in \mathcal{C}_D}, X_{\text{price},d,h}, X_{\text{volume},d,h}, X_{\text{generation},d+1,h}, X_{\text{wind},d+1,h}, X_{\text{solar},d+1,h} \Big)'.$$

As the planned processes (generation, wind and solar) are known one day in advance, they
are represented with the day $d + 1$ in the object $X_{d,h}$. Note that the dimension of $X_{d,h}$ is
$M = M_S + M_D + M_X$ ($M = 16 + 16 + 5 = 37$ given the used data). However, we have to model
and forecast only the first $M_S + M_D$ components for each hour $h$, which exactly match the bid
volume processes of the supply and demand price classes. Moreover, we do not impose a time
series model on $X_{d,h}$ directly, but on its zero mean process $Y_{d,h} = X_{d,h} - \mu_h$ with $\mu_h = E(X_{d,h})$.
We estimate the mean $\mu_h$ by the corresponding sample mean.

Now for each hour $h$ the considered time series model of $Y_{d,h} = (Y_{1,d,h}, \ldots, Y_{M,d,h})'$ is
constructed such that it can potentially depend linearly on $Y_{d-k,h}$, but also on a different hour $Y_{d-k,j}$
with $j \ne h$ and on the introduced weekday dummies. The considered time series model for $Y_{m,d,h}$
for each hour $h$ and $m \in \{1, \ldots, M_S + M_D\}$ is given by

$$Y_{m,d,h} = \sum_{l=1}^{M} \sum_{j=1}^{24} \sum_{k \in \mathcal{I}_{m,h}(l,j)} \phi_{m,h,l,j,k} Y_{l,d-k,j} + \sum_{k=2}^{7} \psi_{m,h,k} W_k(d) + \varepsilon_{m,d,h} \qquad (6)$$

with parameters $\phi_{m,h,l,j,k}$ and $\psi_{m,h,k}$, the sets of lags $\mathcal{I}_{m,h}(l,j)$ and the error term $\varepsilon_{m,d,h}$. We
assume that the error process $(\varepsilon_{m,d,h})_{d \in \mathbb{Z}}$ is i.i.d. with constant variance $\sigma^2_{m,h}$. The introduced
parameters $\phi_{m,h,l,j,k}$ model the linear autoregressive impact and $\psi_{m,h,k}$ the day-of-the-week
effect.

The choice of the lag sets $\mathcal{I}_{m,h}(l,j)$ in (6) is crucial for the full model, as they specify the
possible model structure. In general it holds true that larger sets $\mathcal{I}_{m,h}(l,j)$ increase the likelihood
of overfitting, even though this likelihood is limited by the regularized estimation
technique we use. However, if the lag sets were chosen too small, we might miss important features in
the data. Thus, we should always choose $\mathcal{I}_{m,h}(l,j)$ of reasonable size. This size is determined
by the user and can be chosen freely or be backed up by fundamental data analysis, e.g. of the
correlation structure. Please note that this procedure only determines the possible lag structure
and not the final lag structure, as it only defines the set of lags which our estimation algorithm
will consider. The coefficients that correspond to these lags can have zero impact because of the
estimation procedure. For this paper we decided on


$$\mathcal{I}_{m,h}(l,j) = \begin{cases} \{1, 2, \ldots, 36\}, & m = l \text{ and } h = j \\ \{1, 2, \ldots, 8\}, & (m = l \text{ and } h \ne j) \text{ or } (m \ne l \text{ and } h = j) \\ \{1\}, & m \ne l \text{ and } h \ne j \end{cases}$$

for every bid volume process of price class $m$. Thus, the process $Y_{m,d,h}$ of price class $m$ at day $d$
and hour $h$ can depend on the values of the past 36 days of price class $m$ at hour $h$. In contrast,
$Y_{m,d,h}$ for a specific price class $m$ and a specific hour $h$ is only allowed to depend on the value of
another process at another hour with a maximum lag of 1. In all other cases a maximum lag
of eight is possible. To illustrate this setting, Figure 10 shows the possible dependency structure
of $Y_{m,d,h}$ for an exemplary price class $m$ and the hour $h = 2$. The left-hand side of the figure
shows a specific price class $m$, called the target price class. The blue rectangle symbolizes the hour
which is modeled, e.g. the hour 2:00. Every green rectangle indicates that this lag is
considered for modeling that hour. Red rectangles indicate that the lag is not considered as it
lies outside of our lag definition; gray rectangles indicate that the data is not available as it
is future information. Therefore, the target price class for one hour can depend on every
other hour of that same price class up to eight days back in time, and on the same hour of the
same price class up to 36 days back in time. The possible dependencies on other price classes
are shown in the middle of the illustration. The allowed dependencies on the planned regressors
(generation, wind and solar) are depicted on the right-hand side of the figure. The color scheme
applies to those classes as well. It is worth mentioning that the model for hour 2:00 of a specific
day and price class can depend on other hours of the planned regressors of
that same day, as that information is indeed available before the auction starts. Besides that,
it can only depend on their lagged values for exactly hour 2:00 of the previous seven days. The
shift in dependence on historical values is due to our definition of the regressors.

Figure 10: Illustration of the dependency structure for a target bid volume process of a price
class at hour 2. The three panels (target price class, other class, planned class) show the days
d, d-1, d-2, ..., d-36 against the hours 0 to 23; the legend distinguishes the target response,
possible dependency, no possible dependency and not available.
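
To get a feeling for the size of the regression problem implied by these lag sets, they can be written down directly. The following is a minimal sketch with hypothetical function names; it counts only the autoregressive candidates of model (6) for one target process and hour, assuming $M = 37$ processes and 24 hours as stated above.

```python
def lag_set(m, h, l, j):
    """Candidate lags of process l at hour j for target process m at hour h."""
    if m == l and h == j:
        return range(1, 37)  # same process, same hour: lags 1..36
    if m == l or h == j:
        return range(1, 9)   # same process or same hour: lags 1..8
    return range(1, 2)       # different process and different hour: lag 1 only


def count_candidate_lags(m, h, n_processes=37, n_hours=24):
    """Number of autoregressive candidates in model (6) for one (m, h)."""
    return sum(len(lag_set(m, h, l, j))
               for l in range(1, n_processes + 1)
               for j in range(1, n_hours + 1))


# 36 + 23 * 8 + 36 * 8 + 36 * 23 * 1 candidate lags, before the weekday dummies
print(count_candidate_lags(m=1, h=2))
```

Under these assumptions each of the 24 x 32 regressions has well over a thousand candidate regressors, which is why a regularized estimator is used rather than ordinary least squares.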

For the estimation of the parameters in (6) we use a method of high-dimensional statistics,
namely the lasso estimation procedure, introduced by Tibshirani (1996). Recently it was also
used in the context of electricity price forecasting by Ziel et al. (2015a), Ludwig et al. (2015), Ziel
(2016a) and Gaillard et al. (2016).

The lasso estimator is a penalized least squares estimator, thus we require the regression
representation of model (6). Therefore, let the multivariate ordinary least squares representation
of (6) be:


$$Y_{m,d,h} = \mathbb{X}_{m,d,h}' \beta_{m,h} + \varepsilon_{m,d,h} \qquad (7)$$

with the $p_{m,h}$-dimensional vector of regressors $\mathbb{X}_{m,d,h} = (\mathbb{X}_{m,d,h,1}, \ldots, \mathbb{X}_{m,d,h,p_{m,h}})'$ and $\beta_{m,h} =
(\beta_{m,h,1}, \ldots, \beta_{m,h,p_{m,h}})'$ as $p_{m,h}$-dimensional parameter vector. For the considered lasso estimation
procedure it is also important that the regressors in (7) are standardized. Thus, we introduce
with

$$\tilde Y_{m,d,h} = \tilde{\mathbb{X}}_{m,d,h}' \tilde\beta_{m,h} + \tilde\varepsilon_{m,d,h} \qquad (8)$$

the standardized version of equation (7). Here $\tilde Y_{m,d,h}$ and the elements of $\tilde{\mathbb{X}}_{m,d,h}$ are scaled so
that they have a variance of 1. Given the scaled parameter vector $\tilde\beta_{m,h}$ we can easily reproduce
$\beta_{m,h}$ by rescaling. We estimate the scaled parameter vectors $\tilde\beta_{m,h}$ given $n$ observable days by
using the lasso estimator $\hat{\tilde\beta}_{m,h}$:

$$\hat{\tilde\beta}_{m,h} = \underset{\beta \in \mathbb{R}^{p_{m,h}}}{\arg\min} \; \sum_{d=1}^{n} \big( \tilde Y_{m,d,h} - \tilde{\mathbb{X}}_{m,d,h}' \beta \big)^2 + \lambda_{m,h} \sum_{j=1}^{p_{m,h}} |\beta_j| \qquad (9)$$


where $\lambda_{m,h} \ge 0$ is a penalty parameter. Note that for $\lambda_{m,h} = 0$ we obtain the common ordinary
least squares estimator and for sufficiently large $\lambda_{m,h}$ we get $\hat{\tilde\beta}_{m,h} = 0$. In general, the lasso
estimator is a biased estimator. However, under certain regularity conditions the lasso estimator
is consistent and asymptotically normal for the non-zero parameter components. For example, if
we impose stationarity on the underlying process $Y_{m,d,h}$ we arrive at the mentioned asymptotic
properties. Still, even if the process is heteroscedastic or only periodically stationary we can
achieve the same asymptotic results, see e.g. Ziel (2016b). But roughly speaking, the more the
stationarity assumption of the process is violated and the stronger the correlation
structure in the process, the worse the convergence behavior of the lasso estimator. For more
theoretical and applied details on the lasso estimator we suggest Hastie et al. (2015).

As estimation algorithm we use the coordinate descent approach of Friedman et al.,
which is a fast estimation procedure. For the implementation we use the R package glmnet, see
Friedman et al. (2010). The algorithm solves the lasso problem on a given grid $\Lambda_{m,h}$ of $\lambda_{m,h}$
values. This grid $\Lambda_{m,h}$ is usually chosen to be exponentially decaying. Given a grid $\Lambda_{m,h}$, we select
our optimal tuning parameter $\lambda_{m,h}$ by minimizing the popular Bayesian information criterion (BIC),
which performs conservative model selection. However, the tuning parameter could be chosen by
another information criterion. Cross-validation techniques or test-based approaches as introduced
in Lockhart et al. (2014) might be plausible as well.
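
To make the estimation step concrete, the following Python sketch minimizes the objective in (9) by cyclic coordinate descent and picks the penalty on an exponentially decaying grid by BIC. It is a stand-in written for illustration and not the glmnet implementation used for the paper: the function names, the fixed number of sweeps, the simulated data and the particular BIC formula ($n \log(\mathrm{RSS}/n) + \mathrm{df} \log n$, with df the number of non-zero coefficients) are all assumptions.

```python
import math
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator used in each coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=100):
    """Minimise sum_d (y_d - x_d' b)^2 + lam * sum_j |b_j| by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals for the current beta (all zero at the start)
    col_ss = [sum(row[j] ** 2 for row in X) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            # inner product of column j with the residual that excludes its own effect
            rho = sum(X[d][j] * resid[d] for d in range(n)) + col_ss[j] * beta[j]
            new = soft_threshold(rho, lam / 2.0) / col_ss[j]
            delta = new - beta[j]
            if delta != 0.0:
                for d in range(n):
                    resid[d] -= X[d][j] * delta
                beta[j] = new
    return beta, resid


def select_by_bic(X, y, lambdas):
    """Return (bic, lam, beta) with the smallest BIC over the grid."""
    n = len(y)
    best = None
    for lam in lambdas:
        beta, resid = lasso_cd(X, y, lam)
        rss = sum(r * r for r in resid)
        df = sum(1 for b in beta if b != 0.0)  # number of non-zero coefficients
        bic = n * math.log(rss / n) + df * math.log(n)
        if best is None or bic < best[0]:
            best = (bic, lam, beta)
    return best


# Simulated toy data (hypothetical): only regressors 0 and 3 matter.
rng = random.Random(1)
true_beta = [1.5, 0.0, 0.0, -0.8, 0.0]
X = [[rng.gauss(0.0, 1.0) for _ in true_beta] for _ in range(100)]
y = [sum(b * x for b, x in zip(true_beta, row)) + rng.gauss(0.0, 0.5) for row in X]

# Exponentially decaying grid, starting at the smallest penalty that sets all coefficients to zero.
lam_max = 2.0 * max(abs(sum(row[j] * t for row, t in zip(X, y))) for j in range(len(true_beta)))
grid = [lam_max * 0.7 ** i for i in range(20)]
bic, lam, beta = select_by_bic(X, y, grid)
print(lam, [round(b, 3) for b in beta])
```

In the setting of the paper the regressors would first be standardized as in (8) and the estimated coefficients rescaled afterwards; the toy regressors here are already on a common scale, so that step is omitted.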

Given the estimated parameters $\hat{\tilde\beta}_{m,h}$ for $\tilde\beta_{m,h}$, we calculate the lasso estimator $\hat\beta_{m,h}$ for $\beta_{m,h}$
by rescaling. With $\hat\beta_{m,h}$, which contains the estimates for $\phi_{m,h,l,j,k}$ and $\psi_{m,h,k}$, we can compute a
day-ahead point forecast by

$$\hat Y_{m,n+1,h} = \sum_{l=1}^{M} \sum_{j=1}^{24} \sum_{k \in \mathcal{I}_{m,h}(l,j)} \hat\phi_{m,h,l,j,k} Y_{l,n+1-k,j} + \sum_{k=2}^{7} \hat\psi_{m,h,k} W_k(n+1).$$

If we have the predicted values $\hat Y_{1,n+1,h}, \ldots, \hat Y_{M_S+M_D,n+1,h}$ we obtain the bid volume forecasts
$\hat X_{1,n+1,h}, \ldots, \hat X_{M_S+M_D,n+1,h}$ by adding the sample means. Using a residual based bootstrap as in
Ziel and Liu (2016) we can compute $B$ bootstrap samples $\hat X^{b}_{m,n+1,h}$ for $b \in \{1, \ldots, B\}$. To capture
the correlation structure of the residuals adequately we sample from the residual vector $\hat\varepsilon_{d,h} =
(\hat\varepsilon_{1,d,h}, \ldots, \hat\varepsilon_{M_S+M_D,d,h})'$ only over the days $d$. So, if $\hat\varepsilon_d = (\hat\varepsilon_{d,0}, \ldots, \hat\varepsilon_{d,23})'$ denotes the daily residual
vector for a day, we sample from $(\hat\varepsilon_1, \ldots, \hat\varepsilon_n)$. This guarantees that the residual correlation
structure within the 24 single auctions is preserved. The $B$ bootstrap samples together with the
reconstruction scheme described in the next subsection are used to obtain probabilistic forecasts
for $\hat X_{m,n+1,h}$.
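
The day-wise resampling can be sketched as follows. This is a minimal illustration for the one-day-ahead case with hypothetical names and made-up numbers; the point it demonstrates is that a whole day of residuals (all hours and all price classes) is drawn at once, so the dependence across hours and classes within a day is carried over into each bootstrap sample.

```python
import random


def bootstrap_day_ahead(point_forecast, daily_residuals, n_samples, seed=0):
    """Draw bootstrap forecasts by resampling whole days of residuals.

    point_forecast[h][m]     -- forecast volume for hour h and class m
    daily_residuals[d][h][m] -- in-sample residual for day d, hour h, class m
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        eps_day = rng.choice(daily_residuals)  # all hours and classes of one day
        samples.append([[f + e for f, e in zip(f_h, e_h)]
                        for f_h, e_h in zip(point_forecast, eps_day)])
    return samples


# Hypothetical toy setting: 2 hours, 2 price classes, 3 days of residuals.
forecast = [[1000, 200], [1100, 250]]
residuals = [
    [[10, -5], [20, -10]],
    [[-30, 15], [-40, 5]],
    [[0, 0], [5, 5]],
]
samples = bootstrap_day_ahead(forecast, residuals, n_samples=5)
print(samples[0])
```

Sampling each hour or class independently would be simpler, but would break exactly the within-day correlation that the text sets out to preserve.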


3.3. Reconstructing bids and price curves 

After computing the forecast $\hat X_{m,n+1,h}$ for each class $m \in \{1, \ldots, M_S + M_D\}$ and hour $h$, we
model the apportionment of the forecasted bid volume $\hat X_{m,n+1,h}$ within each price class. This will
be useful for computing both point and probabilistic forecasts of the price curves. Especially
for the probabilistic forecast it is important to understand the bidding structure within a price
class, as we can use it for simulation methods. However, for forecasting the overall behavior, e.g.
if we just want to see whether there is a large probability of high prices, the reconstruction of the bids
is not that relevant. We show that the reconstruction of the bids is relevant for the local price
behavior, especially to explain price clustering.

For example, if we forecast the sale volume of the price class ranging from -55.0 EUR/MWh
to 1.3 EUR/MWh to be 1000 MW, we have to redistribute this volume over the different price
levels within that class, e.g. -55.0, -54.9, ..., 1.3, so that the real bidding behavior is captured
well. In this example, due to price clustering it is very likely that a significant amount of the
1000 MW is bid at 0.0 EUR/MWh, as already explained in the previous section. Furthermore
we have to take into account that many prices are not bid at all. This is important because of
the considered linear interpolation method for creating the price curves. So even a tiny bid of
0.1 MW can have a relatively big impact on the electricity price. This holds for bids on both
the supply and the demand side. As this procedure is crucial for our analysis, we briefly discuss this
issue in a toy example for a minor change in the supply bidding structure. To this end we consider
two scenarios, A and B, for the supply curve and keep the demand curve constant. Scenario
A differs only marginally from scenario B in the bidding structure. In A there are 200 MW
offered at 10 EUR/MWh, whereas in B 199.9 MW are bid at 10 EUR/MWh and 0.1 MW are
bid at 9.9 EUR/MWh. The detailed assumed bids and the corresponding price curves are given
in Figure 11. There we can observe that the supply curve in scenario B looks more rectangular.



Supply, Scenario A

Price    -500   -10     0    10.0     20   3000
Volume   1000    20    50   200.0     50     70


Supply, Scenario B

Price    -500   -10     0     9.9   10.0     20   3000
Volume   1000    20    50     0.1  199.9     50     70


Demand

Price    3000    22    10       0    -10   -500
Volume   1000    10    50      50    200     20


(a) Supply and demand curves with the corresponding market clearing prices. (b) Bidding structure
of the supply and demand curves.


Figure 11: Toy example for two supply scenarios A and B


Indeed, judging by our dataset, market participants on the sale and purchase side also seem to
aim for a price curve that is close to a step function. In our short example, scenario A results
in a market clearing price of 1.60 at a volume of 1102.0, whereas the market clearing price in
B is 7.98 at a volume of 1070.1. The market clearing price is thus 6.38 EUR/MWh higher. This
shows by way of example that a minor change in the bidding structure can cause a severe price change,
especially in price areas with only a few bids, e.g. very large or very small (negative) prices.
Even though this is a toy example, it is surprisingly realistic. We can assume that the described
behavior is known by at least some market participants, as we can observe that some agents try
to strategically choose their bids to achieve the rectangular shape of the function.
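
The toy example can be reproduced with a short script. The sketch below uses the bids listed in Figure 11, accumulates them into supply and demand curves, interpolates linearly between the accumulated points and locates the intersection by bisection on the volume. The function names are hypothetical, and bisection is simply one convenient way to find the crossing; the text does not prescribe a particular algorithm.

```python
from itertools import accumulate


def curve(prices, volumes):
    """Accumulate bid volumes into (cumulative volume, price) points."""
    return list(zip(accumulate(volumes), prices))


def price_at(points, v):
    """Linearly interpolated price at volume v; points are sorted by volume."""
    for (v0, p0), (v1, p1) in zip(points, points[1:]):
        if v0 <= v <= v1:
            return p0 + (p1 - p0) * (v - v0) / (v1 - v0)
    raise ValueError("volume outside the range of the curve")


def clearing(supply, demand, tol=1e-9):
    """Market clearing (price, volume) by bisection on the volume."""
    lo = max(supply[0][0], demand[0][0])
    hi = min(supply[-1][0], demand[-1][0])
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if price_at(supply, mid) < price_at(demand, mid):
            lo = mid  # supply still below demand: the crossing lies to the right
        else:
            hi = mid
    v = (lo + hi) / 2
    return price_at(supply, v), v


# Supply bids are listed from the lowest price upwards, demand bids from the highest downwards.
supply_a = curve([-500, -10, 0, 10, 20, 3000], [1000, 20, 50, 200, 50, 70])
supply_b = curve([-500, -10, 0, 9.9, 10, 20, 3000], [1000, 20, 50, 0.1, 199.9, 50, 70])
demand = curve([3000, 22, 10, 0, -10, -500], [1000, 10, 50, 50, 200, 20])

price_a, vol_a = clearing(supply_a, demand)
price_b, vol_b = clearing(supply_b, demand)
print(round(price_a, 2), round(vol_a, 1))
print(round(price_b, 2), round(vol_b, 1))
```

In scenario B the extra point at 9.9 EUR/MWh makes the interpolated supply curve rise from 0 to 9.9 within only 0.1 MW, so it meets the demand curve much higher than in scenario A, where the same rise is spread over 200 MW.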

To take into account whether a certain price is traded or not, we have to model the probability
of that event. We will refer to this approach as “reconstructing” throughout our paper. For
reconstructed objects, we will use the accent $\breve{\ }$. Remember that $V_{S,t}(P)$ and $V_{D,t}(P)$ denote the
bid volume for the supply and demand at price $P \in \mathbb{P}$ at time $t$. Similarly as for the bid classes
$X^{(c)}_{S,d,h}$ and $X^{(c)}_{D,d,h}$ we introduce the hour-day transformation $V_{S,d,h}(P)$ and $V_{D,d,h}(P)$ of $V_{S,t}(P)$
and $V_{D,t}(P)$ that handles the clock change. We can express the bid volumes $X^{(c)}_{S,d,h}$ and $X^{(c)}_{D,d,h}$
of the price classes by

$$X^{(c)}_{S,d,h} = \sum_{P \in \mathbb{P}_S(c)} V_{S,d,h}(P) \quad \text{and} \quad X^{(c)}_{D,d,h} = \sum_{P \in \mathbb{P}_D(c)} V_{D,d,h}(P),$$

the sum of the bid volumes $V_{S,d,h}$ and $V_{D,d,h}$ of the prices within the price classes. However, after
the price class forecasting we only have the bid volumes $X^{(c)}_{S,d,h}$ and $X^{(c)}_{D,d,h}$ available to derive the
price bids $V_{S,d,h}(P)$ and $V_{D,d,h}(P)$ for all prices $P \in \mathbb{P}$. Therefore, we introduce the reconstructed
bid volumes $\breve V_{S,d,h}(P)$ and $\breve V_{D,d,h}(P)$ at price $P \in \mathbb{P}$ for the supply and demand side. The
reconstructed volumes $\breve V_{S,d,h}(P)$ and $\breve V_{D,d,h}(P)$ should be as close as possible to the true bids
$V_{S,d,h}(P)$ and $V_{D,d,h}(P)$ for all $P \in \mathbb{P}$.

Let $\pi_{S,d,h}(P)$ and $\pi_{D,d,h}(P)$ be the probabilities that $V_{S,d,h}(P)$ and $V_{D,d,h}(P)$, respectively, are
greater than zero, so that there is actually a bid at this price. We assume these probabilities for
the bids are constant over time. We simply estimate $\pi_{S,d,h}(P)$ and $\pi_{D,d,h}(P)$ by the relative
frequencies $\hat\pi_{S,d,h}(P)$ and $\hat\pi_{D,d,h}(P)$ in the given sample.

Furthermore, we assume proportionality within the bid prices in the price classes with respect
to the mean volumes $\bar V_S(P)$ and $\bar V_D(P)$. Then we can express the reconstructed volumes $\breve V_{S,d,h}(P)$
and $\breve V_{D,d,h}(P)$ by

$$\breve V_{S,d,h}(P) = \frac{R_S(P) \bar V_S(P)}{\sum_{Q \in \mathbb{P}_S(c)} R_S(Q) \bar V_S(Q)} \, X^{(c)}_{S,d,h} \qquad (10)$$

$$\breve V_{D,d,h}(P) = \frac{R_D(P) \bar V_D(P)}{\sum_{Q \in \mathbb{P}_D(c)} R_D(Q) \bar V_D(Q)} \, X^{(c)}_{D,d,h} \qquad (11)$$

where $c$ is the price class of $\mathcal{C}_S$ or $\mathcal{C}_D$ associated with price $P \in \mathbb{P}$, and $R_S(P) \sim \text{Ber}(\pi_{S,d,h}(P))$ as
well as $R_D(P) \sim \text{Ber}(\pi_{D,d,h}(P))$ are Bernoulli random variables with probabilities $\pi_{S,d,h}(P)$ and
$\pi_{D,d,h}(P)$. We assume that the Bernoulli random variables $R_S(P)$ and $R_D(P)$ are independent
of each other over the full price grid. Furthermore, we assume that they are independent
of the error term $\varepsilon_{d,h}$ of the time series model in (6) as well.

As we have the estimates $\hat\pi_{S,d,h}$ and $\hat\pi_{D,d,h}$ for the probabilities of the Bernoulli random variables
and the mean bid volumes $\bar V_S$ and $\bar V_D$, we can easily simulate $\breve V_{S,n+1,h}(P)$ and $\breve V_{D,n+1,h}(P)$ by
equations (10) and (11), given the volume forecasts $\hat X^{(c)}_{S,n+1,h}$ and $\hat X^{(c)}_{D,n+1,h}$ of the price classes from the
time series model. These simulations can be utilized to construct forecasts. If we only want to
obtain point estimates for $\breve V_{S,n+1,h}(P)$ and $\breve V_{D,n+1,h}(P)$, we recommend setting $R_S(P)$ and $R_D(P)$
to one if $\hat\pi_{S,n+1,h}(P)$ and $\hat\pi_{D,n+1,h}(P)$ are greater than a certain threshold, and to zero otherwise.
For our purpose we will consider the probability threshold of 1/12. So in our point forecasts a
price is active if it occurs on average at least twice a day. For probabilistic forecasts we utilize
the bootstrap samples $\hat X^{b}_{m,n+1,h}$. For any bootstrap sample $\hat X^{b}_{m,n+1,h}$ we can reconstruct the
bidding structure over the prices using (10) and (11) as well. As we assume independence between the
Bernoulli random variables and the error term, we simply draw from the underlying Bernoulli
distributions independently for each bootstrap sample.
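
A compact sketch of the reconstruction step for a single price class may help here. It implements the proportional split of equation (10), once with the threshold rule for point forecasts and once with independent Bernoulli draws for simulation. All names and numbers are hypothetical, and the handling of the case where no price in the class is active is an assumption of the sketch, since the ratio in (10) is then undefined.

```python
import random


def point_activity(bid_prob, threshold=1 / 12):
    """Point-forecast rule: a price is active if its bid frequency exceeds the threshold."""
    return {p: prob > threshold for p, prob in bid_prob.items()}


def simulated_activity(bid_prob, rng):
    """One independent Bernoulli draw R(P) per price."""
    return {p: rng.random() < prob for p, prob in bid_prob.items()}


def reconstruct_class(class_volume, mean_volumes, active):
    """Equation (10): spread class_volume over the active prices proportionally to mean volumes."""
    total = sum(v for p, v in mean_volumes.items() if active[p])
    if total == 0:
        # No active price: the ratio in (10) is undefined. Returning zeros is an
        # assumption of this sketch, not something specified in the text.
        return {p: 0.0 for p in mean_volumes}
    return {p: class_volume * v / total if active[p] else 0.0
            for p, v in mean_volumes.items()}


# Hypothetical numbers for a few prices of one supply class.
mean_volumes = {-10.0: 5.0, -0.1: 1.0, 0.0: 30.0, 1.0: 4.0}
bid_prob = {-10.0: 0.5, -0.1: 0.02, 0.0: 0.9, 1.0: 0.2}

point = reconstruct_class(390.0, mean_volumes, point_activity(bid_prob))
draw = reconstruct_class(390.0, mean_volumes, simulated_activity(bid_prob, random.Random(0)))
print(point)
print(draw)
```

In the point reconstruction the rarely used price -0.1 receives no volume, and most of the class volume lands on the clustered price 0.0, which is the behavior the proportionality assumption is meant to capture.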

Similarly to equations (2) and (4) we can calculate the supply and demand volumes $\breve S_{d,h}(P)$
and $\breve D_{d,h}(P)$ associated with the price curves, given the volumes $\breve V_{S,d,h}(P)$ and $\breve V_{D,d,h}(P)$ for the
full price grid $\mathbb{P}$, by aggregating

$$\breve S_{d,h}(P) = \sum_{p \in \breve{\mathbb{P}}_{S,d,h},\; p \le P} \breve V_{S,d,h}(p) \ \text{ for } P \in \breve{\mathbb{P}}_{S,d,h} \quad \text{and} \quad \breve D_{d,h}(P) = \sum_{p \in \breve{\mathbb{P}}_{D,d,h},\; p \ge P} \breve V_{D,d,h}(p) \ \text{ for } P \in \breve{\mathbb{P}}_{D,d,h}, \qquad (12)$$

where $\breve{\mathbb{P}}_{S,d,h} = \{ P \in \mathbb{P} \mid R_S(P) = 1 \}$ and $\breve{\mathbb{P}}_{D,d,h} = \{ P \in \mathbb{P} \mid R_D(P) = 1 \}$ are the sets of
reconstructed bid prices. As for the sale and purchase curves in (2), the reconstructed points of the
curves in (12) must be interpolated linearly to obtain the fully reconstructed supply and demand
curves. The intersection of the reconstructed sale and purchase curves $\breve S_{d,h}$ and $\breve D_{d,h}$ provides the
required market clearing volume and price.

Figure 12 illustrates the resulting reconstructed supply and demand curves $\breve S_{d,h}$ and $\breve D_{d,h}$
together with their plain price class approximation counterparts for a selected example. As mentioned, in the
main price region around 20 to 50 with many bids the difference is marginal. But in uncommon
price regions, e.g. around 0, the impact is larger. In general the reconstructed price curves look
much more realistic than the grouped versions.

Figure 12: Example of price class approximation and reconstructed price curves $\breve S_{d,h}$ and $\breve D_{d,h}$
for selected price ranges. Panels: (a) -500 to 3000, (b) -20 to 100. The volume axis is given in MW.


4. Empirical results 

In order to show the results of our X-Model under real world conditions, we performed
a rolling window out-of-sample study for the time period from 01.11.2014 to 19.04.2015. To
evaluate our results, we compare our model with the results of standard models and models
used frequently in the literature. Additionally, we show a detailed forecasting analysis for three
days, namely the 19.12.2014, 24.03.2015 and 12.04.2015. We chose those days for the following
reasons. The first day is suitable to show how price clusters can be predicted. The second
and third day are the days in the selected out-of-sample data range with the largest positive
and negative price spike, respectively. All in all, these days are also suitable to show all important
features of the model, even though they are far from having the best point forecast performance.
The detailed forecasting results of all considered days can be found in the appendix.

For estimation and forecasting of each day we use the previous 730 = 2 x 365 days (2
years) of data. Note that as we consider a rolling window forecasting study with re-estimation,
all objects like the estimated price classes $\mathcal{C}_S$ and $\mathcal{C}_D$ as given in Table 1 vary in the
out-of-sample period. We forecast the supply and demand curves and compute the corresponding
market clearing price and volume as described in the previous section. For obtaining probabilistic
forecasts we perform a residual based bootstrap with a bootstrap sample size of $B = 10000$.
First, we will discuss the forecasted results for the market clearing price and volume of the
aforementioned three selected days. This is followed by the results for the forecasted price
curves of some hours of the 12.04.2015. Finally we will show an out-of-sample forecasting study
for the market clearing price over the whole forecasting period.

For comparing our results regarding the probabilistic forecast, we consider two benchmarks.
As a simple benchmark we take the weekly persistent model, sometimes called naive model,
given by

$$X_{\text{price},d,h} = X_{\text{price},d-7,h} + \varepsilon_{d,h}. \qquad (13)$$

Furthermore, we take a more advanced regime switching model that is in principle able to cover
price spikes. The model is very close to the one used in Karakatsani and Bunn (2008). It is a
Markov switching model and is given by

$$X_{\text{price},d,h} = \mathbb{X}_{d,h}' b_{s(d,h)} + \varepsilon_{s(d,h),h} \quad \text{with} \quad \varepsilon_{s(d,h),h} \sim \mathcal{N}(0, \sigma^2_{s(d,h),h}) \qquad (14)$$

with $\mathbb{X}_{d,h} = (1, X_{\text{price},d-1,h}, X_{\text{price},d-7,h}, \bar X_{\text{price},d-1}, X_{\text{generation},d,h}, X_{\text{wind},d,h}, X_{\text{solar},d,h})'$, parameter
vector $b_{s(d,h)}$, transition probabilities $p_{h,i,j} = P(s(d,h) = i \mid s(d-1,h) = j)$ and $s(d,h)$ as the latent
regime at day $d$ and hour $h$ with $s_{\max}$ possible states. Here $\bar X_{\text{price},d-1}$ is the mean price of the
last day. Note that the solar component is not included for the hours 0:00, 1:00, 2:00, 3:00
and 23:00, as there is no solar energy produced during the night. We estimate the regime switching
model with $s_{\max} = 2$ regimes by maximizing the likelihood with the EM-algorithm. For
all benchmarks we consider the same amount of data as used for the X-Model estimation and
forecasting, namely always two years.

The results of the X-Model for the market clearing price and the volume of the three chosen 
days are given in Figure 13. The price forecasts of the two benchmarks are given in Figure 14. 
Both figures provide probabilistic forecasts for the quantiles ranging from 0.1% to 99, 
which can be regarded as prediction intervals. For the volume forecasts in Figure 13 we see no special 
behavior; the prediction intervals seem to map the daily pattern well. The observations all lie 
clearly within the 95% prediction bands. Thus, it seems that the method provides reliable 
forecasting results for the volume of these days. 

More interesting are the results of the X-Model for the market clearing prices in Figures 13b, 
13d and 13f. There we observe the distinct non-linear behavior of the prices. For small (especially 
negative) prices in 13b and 13f we see clearly left-skewed prediction densities. Similarly, we have 
noticeably right-skewed prediction densities for large prices as in 13d. Therefore, the information 
of previous auction data seems to capture the increased likelihood of extreme price events very 
well. 

In Figure 13b we observe that the aforementioned price clustering at different integer price 
levels, e.g. at 0, can be modeled by this forecasting method. For the first four hours of the 
day the point forecasts for the electricity price were extremely close to zero. So it was 
relatively likely to receive values at the price cluster around 0. And indeed, three market 
clearing prices were in this price cluster, namely 0.05 at 1:00, 0.02 at 2:00 and 0.07 at 3:00. In 
general, we can observe possible price clusters in Figure 13. They are at those spots where the 
colors of the legend change abruptly. For example, in Figure 13b at the 
price cluster at 0 at 2:00 there is an abrupt color change from cyan to ultramarine, and there is another 
cluster at -50 with a color change from red to yellow. 

The forecast plot in Figure 13f for the 12.04.2015 is also suitable to highlight the difference between 
common statistical outliers, i.e. random events that can happen but are extremely rare, and 
price spikes that are predictable in the sense that the probability for such an event is relatively 
large given the available information. The 12.04.2015 was a Sunday, one week after the Easter 
holidays. But the 12th of April opened with a clearly negative price of -14.47 at 0:00 and reached 
values between -79.94 and 31.93 during the day. The prices of the past week were all in the range 
of 12.00 to 69.03, with the last observation on 11.04.2015 23:00 at 22.11. Thus, it is usually 
very complicated to forecast a realistic likelihood for such negative prices with an autoregressive 
approach. However, our X-Model, which focuses on auction data, seems to have recognized 
the pattern within the data and provided a realistic prediction interval nonetheless. Regarding 
the prediction bands in 13f we see clear changes over the day. The band starts quite narrow at 0:00 and 
becomes significantly wider and more left-skewed at around 5:00. This peaks at 14:00, where 
the observed price also reaches its daily minimum. Afterwards the prediction intervals become 
smaller and more symmetric as the forecast moves closer to common price levels. However, for 
the first hours of the day the negative prices are not predicted by the X-Model. Thus, we have 
classical outliers. The benchmark models in 14e and 14f suffer from the same problem. However, 
it is remarkable that the X-Model predicts a quite large probability for negative prices for the 
morning and afternoon hours, especially from 13:00 to 15:00. For instance, the clearly negative 
price of -65.06 at 13:00 lies clearly within the 99% prediction interval. Both benchmark 
models in Figures 14e and 14f were not able to predict these price spikes well. Many standard 
electricity price models only allow for errors where the shape of the density does not depend 
on the predicted value. As in the persistent model, the shape of the prediction density is kept 
constant and simply gets shifted and scaled over time. Thus such models are not suitable 
to capture the real underlying behavior. In contrast, the regime switching model in 14f is in 
general able to cover price spikes, as the forecast density is a mixture density. We see that at 
14:00 and 15:00 the prediction density becomes left-skewed, which provides a clear indication for 
a price spike. However, the magnitude is not well predicted. 

Figure 13: Probabilistic volume and price forecast of the X-Model with point estimate (black line) 
and observed values (colored dots) with legend for the 19.12.2014, 24.03.2015 and 12.04.2015. 
The observed prices are colored as in Figure 5. Recoverable panel titles: (c) Volume forecast for 
24.03.2015, (d) Price forecast for 24.03.2015, (f) Price forecast for 12.04.2015. 

Figure 14: Probabilistic price forecast of considered benchmarks with point estimate (black line) 
and observed values (colored dots) with legend for the 19.12.2014, 24.03.2015 and 12.04.2015. 
The observed prices are colored as in Figure 5. Recoverable panel titles: (a) Persistent model for 
19.12.2014, (b) Markov-Chain-Switching for 19.12.2014, (d) Markov-Chain-Switching for 24.03.2015, 
(f) Markov-Chain-Switching for 12.04.2015. 
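
The statement that a mixture density can become skewed may be illustrated with a small sketch. It assumes a forecast density that is a two-component normal mixture, as a two-regime switching model would produce, and computes quantiles by bisection on the mixture distribution function. All weights, means and standard deviations below are invented for illustration and are not estimates from the paper.

```python
import math


def normal_cdf(x, mu, sigma):
    """Distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def mixture_cdf(x, weights, mus, sigmas):
    """Distribution function of a normal mixture with the given weights."""
    return sum(w * normal_cdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))


def mixture_quantile(q, weights, mus, sigmas, lo=-1000.0, hi=1000.0):
    """q-quantile of the mixture, found by bisection on the monotone CDF."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, weights, mus, sigmas) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical regimes: a base regime around 30 and a rare spike regime
# with a strongly negative mean and a large standard deviation.
weights, mus, sigmas = [0.85, 0.15], [30.0, -20.0], [5.0, 25.0]
q05 = mixture_quantile(0.05, weights, mus, sigmas)
q50 = mixture_quantile(0.50, weights, mus, sigmas)
q95 = mixture_quantile(0.95, weights, mus, sigmas)
```

With these made-up parameters the distance from the median to the 5% quantile is several times larger than the distance to the 95% quantile, which is the kind of left skewness discussed above.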

The largest weakness of all models known so far that are designed for modeling such price 
spikes is that they use only the information of the observed past market clearing prices and 
related processes like wind and solar energy. The number of historical extreme prices that are 
considered by most common models is typically very low, as they occur only rarely. Hence, such 
models often simply have too few data points to learn from the behavior at these price levels. 

The X-Model, on the other hand, uses the bidding information from all time points in all price 
regions. Thus, it can learn a lot about the price behavior in every price region, even for market 
clearing prices which were never realized so far. 

In general, Figure 13 shows that the X-Model adopts the non-linear shape of the price curves 
and passes it on to the forecasts. This automatically adjusts the shape of the prediction 
densities. 

In Figure 15 the coverage probability of all out-of-sample results is visualized. Each bar 
represents a different 1% quantile, whereas the color of the bar matches the specific quantile 
as shown in e.g. Figure 13. The ordinate represents the observed number of values which fell 
into a specific estimated quantile divided by the theoretical number of values of that quantile. 
If the values for the quantiles were all estimated perfectly, the bars would in our case all have 
a value of 1. Nevertheless, we observe that the low and high probability regions around 0 and 
1 (especially the yellow to red colored regions) are clearly overrepresented, indicated by their 
values of greater than 1.5. This suggests that the prediction intervals of the X-Model may be too 
narrow, as it forecasts extreme events with too small a probability. For the quantile areas around 20% to 
60% we observe an underrepresentation. 
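
The ratio plotted on the ordinate can be sketched as follows. We assume that for every forecasted hour the quantile level of the realized value within its forecast distribution is available (here called `pit_values`, a hypothetical name); with a bootstrap sample this level can be approximated by the share of draws not exceeding the observation. The helper names are illustrative and not taken from the paper.

```python
def pit_from_sample(sample, observed):
    """Share of forecast draws that do not exceed the observed value."""
    return sum(1 for x in sample if x <= observed) / len(sample)


def coverage_ratios(pit_values, n_bins=100):
    """Observed count per quantile bin divided by the theoretical count.

    A perfectly calibrated forecast yields ratios close to 1 in every
    bin; ratios above 1 in the outer bins mean that realized values fall
    into the tails more often than the forecast distribution suggests.
    """
    counts = [0] * n_bins
    for u in pit_values:
        if not 0.0 <= u <= 1.0:
            raise ValueError("quantile levels must lie in [0, 1]")
        # Values equal to 1.0 are assigned to the last bin.
        idx = min(int(u * n_bins), n_bins - 1)
        counts[idx] += 1
    expected = len(pit_values) / n_bins
    return [c / expected for c in counts]
```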



Figure 15: Histogram of the empirical coverage of the X-Model with uniform distribution (dashed 
line). The colors for the quantiles match those in Figures 13 and 14. Horizontal axis: quantile, 
from 0 to 1. 


Moreover, we are able to perform forecasts and compute prediction intervals for the full price 
curves. In Figure 16 we plot, as an example, the forecast for the four selected hours that we discussed 
in the introduction. Figures 16a and 16b show a forecast at 12:00 and 13:00, where the realized 
price dropped from -4.96 to -65.06. In Figures 16c and 16d we have the price curves at 19:00 
and 20:00, where the market clearing price increased from 27.92 to the highest value of that day, 
31.93. Remember that in the 12:00 and 13:00 case in 13f the prediction densities of the market 
clearing price were highly left-skewed and in the 19:00 and 20:00 case relatively symmetric. The 
graphs in Figure 16 show, in addition to the forecasted price curves with their prediction intervals, the 
realized supply and demand curves of the actual auction. Note that we only show the most 
relevant price region between -100 and 150. 

For the 13:00 case with the clearly negative price, the observed demand and supply curves lie 
within the relatively narrow 90% prediction bands of both curves. But this does not mean that 
the market clearing price lies in the 90% prediction interval as well. The reason is that both 
prediction bands have a complex dependence structure. In fact, the observed price in 16b 
lies only in the 99% prediction interval in Figure 13f. However, we can see quite well that the 
predicted intersection is in a region where both the supply and the demand curve have a relatively 
large absolute slope. The magnitude of the slope even increases for more negative values. This is 
the reason for the clearly left-skewed prediction density, because a relatively moderate increase 
in the supply curve or decrease in the demand curve causes relatively large price movements. In 
economic terms this coincides with a situation where both the supply and the demand side are 
relatively inelastic in the negative price region. This induces high price volatility. 

Figure 16: Supply and demand curve forecasts with prediction bands for the 12.04.2015 and 
selected hours. 
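
The effect of inelastic curves on the market clearing price can be made concrete with a toy example. The sketch below assumes piecewise-linear supply and demand curves on a common price grid and finds their intersection by bisection; a constant `shift` added to the supply volume mimics a volume shock. The curve values are invented, and the helper names are hypothetical; the X-Model itself reconstructs the curves from forecasted bid volumes in price classes.

```python
from bisect import bisect_right


def interp(x, xs, ys):
    """Piecewise-linear interpolation with constant extrapolation."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


def clearing_price(prices, supply_vol, demand_vol, shift=0.0, tol=1e-9):
    """Price at which supply(p) + shift equals demand(p).

    Assumes supply volume is non-decreasing and demand volume is
    non-increasing in the price, so the excess supply is monotone.
    """
    def excess(p):
        return interp(p, prices, supply_vol) + shift - interp(p, prices, demand_vol)

    lo, hi = prices[0], prices[-1]
    if excess(lo) > 0 or excess(hi) < 0:
        raise ValueError("curves do not cross on the price grid")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Invented curves: supply is elastic between 0 and 50 but very inelastic
# below 0; demand is inelastic everywhere.
grid = [-100.0, 0.0, 50.0, 150.0]
supply = [20000.0, 21000.0, 31000.0, 33000.0]
demand = [27000.0, 26500.0, 26000.0, 25000.0]
p_base = clearing_price(grid, supply, demand)              # about 26.19
p_small = clearing_price(grid, supply, demand, 3000.0)     # about 11.90
p_large = clearing_price(grid, supply, demand, 6000.0)     # about -33.33
```

In this toy setting the first 3000 units of additional supply lower the price by roughly 14, whereas the next 3000 units push the intersection into the inelastic negative region and lower it by roughly 45 more.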

At 19:00, where the market clearing price turned out to be relatively high, the prediction 
intervals look similar in general, but also have important differences in the detail. Here the 
demand side is still elastic in the region close to the market clearing price. However, for a price 
level of around 30 to 40 the demand side also seems to lose elasticity quite dramatically. The 
supply side is still quite elastic up to a price region of 50 to 60. Small volume shocks in the 
bidding structure are therefore likely to be compensated by the elastic supply, resulting in a high 
likelihood of small price changes. However, for medium to large volume shocks the 
intersection might be shifted out of the quite elastic area of the demand curve. This is detected 
by our model and indicated by a large volatility with a clear right-skewness of the prediction 
interval, as can also be seen in Figure 13b. 


Now we want to compare the point forecast of the proposed X-Model in the out-of-sample 
region from 01.11.2014 to 19.04.2015 with several common benchmarks. Even though the model 
is primarily designed to detect and model extreme price events with the corresponding prediction 
densities, it is interesting to see the performance purely based on standard error measures in 
comparison to other established electricity models. 

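Denote by $\widehat{X}_{\text{price},d,h}$ the point forecast of an electricity price model at day d and hour 
h that corresponds to $X_{\text{price},d,h}$. Further, we denote by $\mathcal{D}$ the set of all days from 01.11.2014 to 
19.04.2015, except for the 29.03.2015, which we ignore here as it is the day where the time was 
switched due to daylight saving time. So $\mathcal{D}$ contains in total $\#(\mathcal{D}) = 169$ days. We define the 
common error measures mean absolute error (MAE$_h$) at hour h and root 
mean square error (RMSE$_h$) at hour h by 

$$\text{MAE}_h = \frac{1}{\#(\mathcal{D})} \sum_{d \in \mathcal{D}} \left| X_{\text{price},d,h} - \widehat{X}_{\text{price},d,h} \right|,$$

$$\text{RMSE}_h = \sqrt{\frac{1}{\#(\mathcal{D})} \sum_{d \in \mathcal{D}} \left| X_{\text{price},d,h} - \widehat{X}_{\text{price},d,h} \right|^2}.$$

Both measures are suitable to compare point forecasts of different models at a certain hour h. Similarly 
to the MAE$_h$ and RMSE$_h$, we define the overall MAE and RMSE by 

$$\text{MAE} = \frac{1}{24 \#(\mathcal{D})} \sum_{d \in \mathcal{D}} \sum_{h=0}^{23} \left| X_{\text{price},d,h} - \widehat{X}_{\text{price},d,h} \right|,$$

$$\text{RMSE} = \sqrt{\frac{1}{24 \#(\mathcal{D})} \sum_{d \in \mathcal{D}} \sum_{h=0}^{23} \left| X_{\text{price},d,h} - \widehat{X}_{\text{price},d,h} \right|^2}.$$
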

In general the MAE is more robust than the RMSE, as the latter is by far more sensitive to 
outliers. 
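
The error measures can be computed in a few lines. The sketch below assumes that observed and forecasted prices are stored as nested lists `actual[d][h]` and `forecast[d][h]` over the evaluated days; this layout and the function names are assumptions for illustration.

```python
import math


def hourly_errors(actual, forecast):
    """Return (MAE_h, RMSE_h) as two lists with one entry per hour."""
    n_days = len(actual)
    n_hours = len(actual[0])
    mae_h, rmse_h = [], []
    for h in range(n_hours):
        errs = [actual[d][h] - forecast[d][h] for d in range(n_days)]
        mae_h.append(sum(abs(e) for e in errs) / n_days)
        rmse_h.append(math.sqrt(sum(e * e for e in errs) / n_days))
    return mae_h, rmse_h


def overall_errors(actual, forecast):
    """Return the overall (MAE, RMSE) over all days and hours."""
    errs = [a - f
            for row_a, row_f in zip(actual, forecast)
            for a, f in zip(row_a, row_f)]
    mae = sum(abs(e) for e in errs) / len(errs)
    rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
    return mae, rmse
```

For two days with constant errors of 1 and -3 at every hour, the MAE is 2 while the RMSE is the square root of 5 (about 2.24), which shows how the larger error weighs more heavily in the RMSE.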

The first two benchmarks we consider are the persistent model (Persistent) given in equation 
(13) and the regime switching model as presented in equation (14) with $s_{\max} = 2$ (Regime). The 
next simple benchmark that we consider is a very powerful one in terms of MAE. It uses different 
information than our model, namely the electricity price from the Energy Exchange Austria 
(EXAA). This is an electricity price for Germany and Austria with the same zones for physical 
settlement as the German and Austrian EPEX spot price. It is traded every day at 10:12 and 
the prices are known at 10:20 for market participants, which means in particular that they are known 
in advance of the EPEX auction at 12:00. Ziel et al. (2015b) show that the very simple naive 
estimator $\widehat{X}_{\text{price},d,h} = X^{\text{EXAA}}_{\text{price},d,h}$, with $X^{\text{EXAA}}_{\text{price},d,h}$ as the EXAA electricity price at day d and hour h, 
is very competitive. However, the EXAA benchmark model (EXAA) is basically beyond the 
competition, as it uses information which we did not explicitly include in our X-Model. But 
still, it can help to gain insights about possible improvements. Furthermore, we introduce two 
AR(p) based models, namely a univariate one on $X_{\text{price},t}$ (AR(p)) and a 24-dimensional model with 
24 simple univariate AR models on $X_{\text{price},d,h}$ for each hour h (24-dim. AR). They are formally 
defined by 

$$X_{\text{price},t} = \phi_0 + \sum_{k=1}^{p} \phi_k X_{\text{price},t-k} + \varepsilon_t \quad \text{with} \quad \varepsilon_t \sim \mathcal{N}(0, \sigma^2),$$

$$X_{\text{price},d,h} = \phi_{h,0} + \sum_{k=1}^{p_h} \phi_{h,k} X_{\text{price},d-k,h} + \varepsilon_{d,h} \quad \text{with} \quad \varepsilon_{d,h} \sim \mathcal{N}(0, \sigma_h^2).$$


We estimate the AR models by solving the Yule-Walker equations. The optimal orders $p$ and 
$p_h$ are determined by minimizing the Akaike Information Criterion (AIC) on a grid of possible 
orders. For the univariate AR model we search the optimal $p$ on {1, 2, ..., 700}, which allows for 
dependencies of more than four weeks. For the 24-dimensional model the optimal order $p_h$ is 
searched on {1, 2, ..., 50}, which allows for a memory of up to seven weeks and one day. 
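
The order selection can be sketched in plain Python. The example below solves the Yule-Walker equations for all orders up to a maximum with the Levinson-Durbin recursion and picks the order with the smallest AIC. The concrete AIC variant, n log(sigma^2) + 2(p + 1), and all function names are our assumptions; the source does not state which implementation or AIC form was used.

```python
import math
import random


def autocovariances(x, max_lag):
    """Biased sample autocovariances r[0], ..., r[max_lag]."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[t] * c[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]


def yule_walker_fits(x, max_order):
    """Yule-Walker AR fits for orders 1..max_order via Levinson-Durbin.

    Returns {p: (coefficients phi_1..phi_p, innovation variance)}.
    """
    r = autocovariances(x, max_order)
    phi, sigma2, fits = [], r[0], {}
    for k in range(1, max_order + 1):
        # Reflection coefficient of order k.
        kappa = (r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))) / sigma2
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        sigma2 *= 1.0 - kappa * kappa
        fits[k] = (list(phi), sigma2)
    return fits


def select_order_by_aic(x, max_order):
    """Order in 1..max_order minimizing n*log(sigma2) + 2*(p + 1)."""
    n = len(x)
    fits = yule_walker_fits(x, max_order)
    return min(fits, key=lambda p: n * math.log(fits[p][1]) + 2 * (p + 1))


def simulate_ar1(phi, n, seed=0):
    """Simulate a zero-mean AR(1) series with standard normal innovations."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x
```

The recursion yields every order on the grid in a single pass, which keeps a search over large grids such as {1, ..., 700} cheap once the autocovariances are available.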

Furthermore, we consider two more models from the literature, a wavelet based model and 
a more advanced time series approach. The wavelet based approach is basically the popular 
wavelet-ARIMA model introduced by Conejo et al. (2005). We use the Daubechies 4 wavelet 
decomposition and model the coefficients of the wavelet decomposition by an ARIMA(12,1,1). The 
second benchmark model is a time series based approach that is analyzed by Keles et al. (2012). 
We select the ARMA(5,1) model with a trend component as well as their sophisticated annual, 
weekly and daily seasonal components. The model is suggested as one of the best models in the 
comparison study by Keles et al. (2012). We refer to the two models as Conejo et al. and Keles 
et al., respectively. 

The estimated MAE and RMSE values of all considered models with their estimated standard 
deviations are given in Table 2. The hourly MAE$_h$ and RMSE$_h$ for all models are visualized in 
Figure 17. 

Models          MAE (std.dev.)    % of persistent    RMSE (std.dev.)    % of persistent
X-Model          4.35 (0.076)          40.8            6.46 (0.217)          44.3
Persistent      10.66 (0.159)         100.0           14.60 (0.240)         100.0
Regime           8.83 (0.117)          82.9           11.60 (0.197)          79.5
EXAA             3.26 (0.065)          30.6            5.23 (0.303)          35.8
AR(p)            5.91 (0.090)          55.4            8.25 (0.222)          56.5
24-dim. AR       6.96 (0.103)          65.3            9.55 (0.219)          65.4
Conejo et al.    8.02 (0.112)          75.3           10.72 (0.213)          73.4
Keles et al.     7.11 (0.099)          66.7            9.53 (0.219)          65.3

Table 2: MAE and RMSE in EUR/MWh of the X-Model and several benchmark models 




Figure 17: MAE$_h$ and RMSE$_h$ for h in {0, ..., 23} for the considered models. Panels: (a) MAE$_h$, 
(b) RMSE$_h$. Legend: X-Model, Persistent, EXAA, Regime, AR(p), 24-dim. AR, Conejo et al., Keles et al. 


There we observe that the proposed X-Model performs surprisingly well, even though it does 
not directly model the electricity price. With an MAE of 4.35 and an RMSE of 6.46 it clearly 
outperforms all considered models, except the EXAA model with an MAE of 3.26 and an RMSE 
of 5.23. In the night hours from 0:00 to 5:00 as well as at 23:00 the X-Model seems to be at the 
same error magnitude as the EXAA model and sometimes outperforms every other model under 
consideration. 

The out-of-sample MAE proportion of the X-Model in comparison to the persistent model 
is about 40.8% (Table 2). The second-best model which uses the same information as the X-Model is 
the AIC-selected univariate AR with a relative MAE proportion of 55.4%. Here the MAE is in 
absolute value 1.56 larger than the MAE of the X-Model. 


5. Summary and conclusion 

We present a model for the day-ahead electricity spot price that directly models the supply 
and demand curves. We call our model the X-Model, as we estimate the market clearing price 
as the intersection of the sale and purchase curves of the German-Austrian day-ahead electricity 
market of the EPEX. Simple dimension reduction techniques and high-dimensional statistical 
methods allow us to deal with the huge amount of bid data. We group the possible bid prices 
into price classes and assume a linear model for the bid volume of each price class. Afterwards 
we forecast the bid volumes in the price classes, reconstruct the sale and purchase curves and 
obtain the corresponding market clearing price. 

Our empirical results show that it is possible to model the electricity prices using such an 
approach in a very promising way. We can capture known stylized effects of the electricity price, 
like daily and weekly seasonalities, very well and are also able to model the newly elaborated 
stylized facts of price bids. The complex bidding structure for day-ahead prices allows us to 
model and predict extreme and rare price events by estimating realistic prediction densities for 
the market clearing price. The conducted out-of-sample study shows that the introduced model 
clearly outperforms standard methods and even very well performing methods of the recent 
literature in terms of densities as well as error measures like MAE and RMSE. The latter in 
particular was remarkable to us, as the model approach is relatively simple at its core 
and mainly developed for the purpose of modeling extreme price events. 

The provided X-Model approach opens the door to many other applications, especially 
those related to policy making. One very important issue is, for example, the impact 
of market regulations. Many countries provide subsidies for renewable energy. This 
automatically causes the so-called merit-order effect on the corresponding electricity markets. There 
are many papers (e.g. Sensfuß et al. (2008), McConnell et al. (2013), Cludius et al. (2014), 
Dillig et al. (2016)) that aim at estimating these effects. With a sale and purchase curve based 
approach we can directly model the impact of the renewable energy. The only condition which 
must be met is the availability of data, as for the German and Austrian market. The advantage 
is that the sale and purchase curve based approach directly takes into account the market 
behavior with all its complex dependencies and non-linear properties. Common modeling approaches 
are hardly able to cover such behavior. 

Another important application could be the evaluation of the price effect of closing a power 
plant, e.g. due to a phase-out of nuclear or lignite based power plants. By proposing just a 
few assumptions for the bidding behavior and the fuel costs it is possible to postulate a proper 
model for the electricity market. This would allow the researcher to get realistic price forecasts 
that can be utilized by decision-makers. Note that such forecasts could be achieved for more 
than one day ahead. Given a proper model design, even long-run studies of several months and 
years are possible. This could be combined with different scenarios for related indicators like 
GDP growth or fuel costs. 

Moreover, the paper with its proposed model can support the dialog between two model disciplines 
in electricity price modeling. At the moment there are classical statistical, time series and 
machine learning techniques that forecast the market clearing price based on observations and 
related time series. The other model approaches are mainly fundamental or multi-agent based 
electricity price models, which analyze the electricity market from a theoretical point of view 
and usually ignore real auction data. Even though both disciplines may differ in their targeted 
goal of e.g. forecasting the electricity price versus understanding the market relationships, they 
have a major similarity. Overall they are both modeling the electricity price and aiming to 
approximate it as closely as possible; they just take different perspectives on it. Our approach, 
which is indeed based on econometric approaches, took one step towards the fundamental way of 
modeling and was therefore able to gain new insights which were crucial for our approach. Hence, 
we are convinced that this paper may provide a good starting point for increased communication 
between representatives of both model disciplines. 

Future research should also improve the considered model for the time series of bids. First of 
all, our simplification of linearly interpolating the bids without explicitly including more complex 
bids like block bids should be replaced by a more realistic algorithm which provides a closer 
approximation to EUPHEMIA (see e.g. Dourbois and Biskas (2015)). In this sense, the 
market coupling in Europe as well as the influence of imports and exports on electricity prices 
can be incorporated as well (see e.g. Wehinger et al. (2013)). Also, more investigation could 
be done in terms of optimizing the way of classifying the bids. Applying other 
dimension reduction techniques to the bid data might grant great improvements as well. 

Another important issue concerns other relevant data used for modeling the bid volume of 
the price classes. For example, we have so far ignored the impact of public holidays. On holidays 
like Christmas Eve, Christmas Day or New Year's Day the model performs relatively poorly. 
Here improvement should be relatively easy to achieve. Moreover, the inclusion of market price time series 
of different markets like the intra-day price as well as auction results of related markets, such 
as those from neighboring countries, could be beneficial for the model quality. Other useful 
regressors could be different fuel costs or CO2 allowances. Also the restructuring procedure that 
was used for mapping the local price behavior provides a lot of space for further improvement. 
The probabilities that a certain price is traded or not could be modeled as time-varying. 